determined can also be used as probes for the identification
and isolation of corresponding sequences, including promoter 
sequences, from other cereal plant species. 

In its most general aspect, the invention provides 
a nucleic acid sequence encoding an enzyme of the starch 
biosynthetic pathway in a cereal plant, said enzyme being 
selected from the group consisting of starch branching 
enzyme I, starch branching enzyme II, starch soluble 
synthase I, and debranching enzyme, with the proviso that
the enzyme is not soluble starch synthase I of rice, or 
starch branching enzyme I of rice or maize, and that starch 
branching enzyme II does not have the N-terminal amino acid 
sequence: 

AASPGKVLVPDGEDDLASPA.

Preferably the nucleic acid sequence is a DNA 
sequence, and may be genomic DNA or cDNA. Preferably the 
sequence is one which is functional in wheat. More 
preferably the sequence is derived from a Triticum species,
most preferably Triticum tauschii. 

Where the sequence encodes soluble starch 
synthase, preferably the sequence encodes the 75 kD soluble 
starch synthase of wheat. 

Biologically-active untranslated control sequences 
of genomic DNA are also within the scope of the invention. 
Thus the invention also provides the promoter of an enzyme 
as defined above. 

In a preferred embodiment of this aspect of the 
invention, there is provided a nucleic acid construct 
comprising a nucleic acid sequence of the invention, a 
biologically-active fragment thereof, or a fragment thereof 
encoding a biologically-active fragment of an enzyme as 
defined above, operably linked to one or more nucleic acid 
sequences facilitating expression of said enzyme in a plant, 
preferably a cereal plant. The construct may be a plasmid 
or a vector, preferably one suitable for use in the 
transformation of a plant. A particularly suitable vector 
is a bacterium of the genus Agrobacterium, preferably 
Agrobacterium tumefaciens. Methods of transforming cereal
plants using Agrobacterium tumefaciens are known; see for
example Australian Patent No. 667939 by Japan Tobacco Inc.,
1. A nucleic acid sequence encoding an enzyme of the
starch biosynthetic pathway in a cereal plant, wherein the
enzyme is selected from the group consisting of starch
branching enzyme I, starch branching enzyme II, starch
soluble synthase I, and debranching enzyme, with the proviso
that the enzyme is not soluble starch synthase I of rice, or
starch branching enzyme I of rice or maize, and that starch
branching enzyme II does not have the N-terminal amino acid
sequence:

AASPGKVLVPDGEDDLASPA.

2. A sequence according to claim 1, wherein the
sequence is a genomic DNA or cDNA sequence.

3. A sequence according to claim 1 or claim 2, wherein
the sequence is functional in wheat.

4. A sequence according to any one of claims 1 to 3,
wherein the sequence is derived from a Triticum species.

5. A sequence according to claim 4, wherein the
Triticum species is Triticum tauschii.

6. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes starch branching enzyme I or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 5 or SEQ ID NO: 9.

7. A sequence according to claim 6, wherein the
homology is at least 90%.

8. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes starch branching enzyme II or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 10.

biologically-active fragment thereof, and wherein the
promoter sequence has at least 70% sequence homology with 
the sequence shown in SEQ ID No: 8.

18. A sequence according to claim 17, wherein the 
homology is at least 90%. 

19. A promoter according to claim 16, wherein the 
promoter is a starch soluble synthase I promoter or 
biologically-active fragment thereof, and wherein the 
promoter sequence has at least 70% sequence homology with 
the sequence shown in SEQ ID No: 15. 

20. A sequence according to claim 19, wherein the 
homology is at least 90%. 

21. A nucleic acid construct comprising a nucleic acid 
sequence encoding an enzyme of the starch biosynthetic 
pathway in a cereal plant, operably linked to one or more 
nucleic acid sequences facilitating expression of the 
nucleic acid sequence in a plant, wherein the enzyme is 
selected from the group consisting of starch branching 
enzyme I, starch branching enzyme II, starch soluble 
synthase I, and debranching enzyme, with the proviso that 
the enzyme is not soluble starch synthase I of rice, or 
starch branching enzyme I of rice or maize, a biologically- 
active fragment thereof, and that starch branching enzyme II 
does not have the N-terminal amino acid sequence: 

AASPGKVLVPDGEDDLASPA.

22. A nucleic acid construct for targeting a gene to 
the endosperm of a cereal plant, comprising one or more 
promoter sequences selected from the group consisting of 
SBE I promoter, SBE II promoter, SSS I promoter, and
DBE promoter, operatively linked to a nucleic acid sequence
encoding a protein, wherein the expression of the targeted
gene in the endosperm of a cereal plant is modified.
International application No. PCT/AU98/00743

Basis of the report

With regard to the elements of the international application:*

[ ] the international application as originally filed.

[X] the description, pages 1-6, 8-60, as originally filed,
        pages , filed with the demand,
        pages 7 and 7a, filed with the letter of June 1999.

[X] the claims, pages 117, 119-122, as originally filed,
        pages , as amended (together with any statement) under Article 19,
        pages , filed with the demand,
        pages 116 and 118, filed with the letter of June 1999.

[X] the drawings, pages 1/44-44/44, as originally filed,
        pages , filed with the demand,
        pages , filed with the letter of .

[X] the sequence listing part of the description:
        pages 61-115, as originally filed
        pages , filed with the demand
        pages , filed with the letter of

With regard to the language, all the elements marked above were available or furnished to this Authority in the language in
which the international application was filed, unless otherwise indicated under this item.

These elements were available or furnished to this Authority in the following language which is:

[ ] the language of a translation furnished for the purposes of international search (under Rule 23.1(b)).
[ ] the language of publication of the international application (under Rule 48.3(b)).
[ ] the language of the translation furnished for the purposes of international preliminary examination (under Rules 55.2
    and/or 55.3).

With regard to any nucleotide and/or amino acid sequence disclosed in the international application, the international
preliminary examination was carried out on the basis of the sequence listing:

[X] contained in the international application in written form.
[ ] filed together with the international application in computer readable form.
[ ] furnished subsequently to this Authority in written form.
[ ] furnished subsequently to this Authority in computer readable form.
[ ] The statement that the subsequently furnished written sequence listing does not go beyond the disclosure in the
    international application as filed has been furnished.
[ ] The statement that the information recorded in computer readable form is identical to the written sequence listing has
    been furnished.

[ ] The amendments have resulted in the cancellation of:
    [ ] the description, pages
    [ ] the claims, Nos.
    [ ] the drawings, sheets/fig

[ ] This report has been established as if (some of) the amendments had not been made, since they have been considered
    to go beyond the disclosure as filed, as indicated in the Supplemental Box (Rule 70.2(c)).**

* Replacement sheets which have been furnished to the receiving Office in response to an invitation under Article 14 are
referred to in this report as "originally filed" and are not annexed to this report since they do not contain amendments
(Rules 70.16 and 70.17).

** Any replacement sheet containing such amendments must be referred to under item 1 and annexed to this report.
Novelty (N)                      Claims 1-52    YES
                                 Claims none    NO

Inventive step (IS)              Claims 1-52    YES
                                 Claims none    NO

Industrial applicability (IA)    Claims 1-52    YES
                                 Claims none    NO



2. Citations and explanations (Rule 70.7) 



The closest prior art is D6 (Nair et al) as listed on the International Search Report. D6 discloses an N-terminal sequence 
specifically excluded from the claimed enzymes of the present application. The claims are thus considered both novel and 
inventive in light of the prior art. 

The claimed matter is considered industrially applicable. 
determined can also be used as probes for the identification
and isolation of corresponding sequences, including promoter 
sequences, from other cereal plant species. 

In its most general aspect, the invention provides 
a nucleic acid sequence encoding an enzyme of the starch 
biosynthetic pathway in a cereal plant, said enzyme being 
selected from the group consisting of starch branching 
enzyme I, starch branching enzyme II, starch soluble 
synthase I, and debranching enzyme, with the proviso that 
the enzyme is not soluble starch synthase I of rice, or 
starch branching enzyme I of rice or maize. 

Preferably the nucleic acid sequence is a DNA 
sequence, and may be genomic DNA or cDNA. Preferably the 
sequence is one which is functional in wheat. More 
preferably the sequence is derived from a Triticum species, 
most preferably Triticum tauschii. 

Where the sequence encodes soluble starch 
synthase, preferably the sequence encodes the 75 kD soluble 
starch synthase of wheat. 

Biologically-active untranslated control sequences 
of genomic DNA are also within the scope of the invention. 
Thus the invention also provides the promoter of an enzyme 
as defined above. 

In a preferred embodiment of this aspect of the 
invention, there is provided a nucleic acid construct 
comprising a nucleic acid sequence of the invention, a 
biologically-active fragment thereof, or a fragment thereof 
encoding a biologically-active fragment of an enzyme as 
defined above, operably linked to one or more nucleic acid 
sequences facilitating expression of said enzyme in a plant, 
preferably a cereal plant. The construct may be a plasmid 
or a vector, preferably one suitable for use in the 
transformation of a plant. A particularly suitable vector 
is a bacterium of the genus Agrobacterium, preferably
Agrobacterium tumefaciens. Methods of transforming cereal
plants using Agrobacterium tumefaciens are known; see for 
example Australian Patent No. 667939 by Japan Tobacco Inc., 
CLAIMS

1. A nucleic acid sequence encoding an enzyme of the
starch biosynthetic pathway in a cereal plant, wherein the
enzyme is selected from the group consisting of starch
branching enzyme I, starch branching enzyme II, starch
soluble synthase I, and debranching enzyme, with the proviso
that the enzyme is not soluble starch synthase I of rice, or
starch branching enzyme I of rice or maize.

2. A sequence according to claim 1, wherein the
sequence is a genomic DNA or cDNA sequence.

3. A sequence according to claim 1 or claim 2,
wherein the sequence is functional in wheat.

4. A sequence according to any one of claims 1 to 3,
wherein the sequence is derived from a Triticum species.

5. A sequence according to claim 4, wherein the
Triticum species is Triticum tauschii.

6. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes starch branching enzyme I or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 5 or SEQ ID NO: 9.

7. A sequence according to claim 6, wherein the
homology is at least 90%.

8. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes starch branching enzyme II or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 10.
9. A sequence according to claim 8, wherein the
homology is at least 90%.

10. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes soluble starch synthase or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 11 or SEQ ID NO: 13.

11. A sequence according to claim 10, wherein the
homology is at least 90%.

12. A sequence according to claim 11, wherein the
sequence encodes a 75 kD soluble starch synthase of wheat.

13. A sequence according to claim 12, which encodes an
amino acid sequence at least 70% homologous to that shown in
SEQ ID NO: 14.

14. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes debranching enzyme or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID No: 17.

15. A sequence according to claim 14, wherein the
homology is at least 90%.

16. A promoter of an enzyme selected from the group
consisting of starch branching enzyme I, starch branching
enzyme II, starch soluble synthase I, and debranching
enzyme, with the proviso that the enzyme is not soluble
starch synthase I of rice, or starch branching enzyme I of
rice or maize.

17. A promoter according to claim 16, wherein the
promoter is a starch branching enzyme I promoter or
biologically-active fragment thereof, and wherein the
promoter sequence has at least 70% sequence homology with
the sequence shown in SEQ ID No: 8.

18. A sequence according to claim 17, wherein the
homology is at least 90%.

19. A promoter according to claim 16, wherein the
promoter is a starch soluble synthase I promoter or
biologically-active fragment thereof, and wherein the
promoter sequence has at least 70% sequence homology with
the sequence shown in SEQ ID No: 15.

20. A sequence according to claim 19, wherein the
homology is at least 90%.

21. A nucleic acid construct comprising a nucleic acid
sequence encoding an enzyme of the starch biosynthetic
pathway in a cereal plant, operably linked to one or more
nucleic acid sequences facilitating expression of the
nucleic acid sequence in a plant, wherein the enzyme is
selected from the group consisting of starch branching
enzyme I, starch branching enzyme II, starch soluble
synthase I, and debranching enzyme, with the proviso that
the enzyme is not soluble starch synthase I of rice, or
starch branching enzyme I of rice or maize, a
biologically-active fragment thereof.

22. A nucleic acid construct for targeting a gene to
the endosperm of a cereal plant, comprising one or more
promoter sequences selected from the group consisting of
SBE I promoter, SBE II promoter, SSS I promoter, and
DBE promoter, operatively linked to a nucleic acid sequence
encoding a protein, wherein the expression of the targeted
gene in the endosperm of a cereal plant is modified.
23. A construct according to either claim 21 or claim
22, wherein the promoter or nucleic acid sequence is also
operatively linked to one or more additional targeting
sequences and/or one or more 3' untranslated sequences.

24. A construct according to claim 23, wherein the
nucleic acid encoding the protein is either in the sense or
antisense orientation.

25. A construct according to claim 24, wherein the
protein is an enzyme of the starch biosynthetic pathway.

26. A construct according to claim 25, wherein the
nucleic acid encoding the protein is in the antisense
orientation, and the enzyme is selected from the group
consisting of GBSS, starch debranching enzyme, SBE II, low
molecular weight glutenin, and grain softness protein I.

27. A construct according to claim 25, wherein the
nucleic acid encoding the protein is in the sense
orientation, and the enzyme is selected from the group
consisting of bacterial isoamylase, bacterial glycogen
synthase, and wheat high molecular weight glutenin Bx17.

28. A construct according to any one of claims 21 to
27, wherein the plant is a cereal plant.

29. A construct according to claim 28, wherein the
cereal plant is either wheat or barley.

30. A construct according to claim 29, wherein the
cereal plant is wheat.

31. A construct according to any one of claims 21 to
30, wherein the construct is either a plasmid or a vector.
32. A construct according to claim 31, wherein the
plasmid or vector is suitable for use in the transformation
of a plant.

33. A construct according to claim 32, wherein the
plasmid is selected from the group consisting of those
depicted in Figures 22a to 22f.

34. A construct according to claim 32, wherein the
vector is a bacterium of the genus Agrobacterium.

35. A construct according to claim 34, wherein the
vector is Agrobacterium tumefaciens.

36. A method of modifying the characteristics of
starch produced by a plant, comprising the steps of:

(a) introducing a nucleic acid sequence encoding
an enzyme of the starch biosynthetic pathway into a host
plant, and/or

(b) introducing an anti-sense nucleic acid
sequence directed to a gene encoding an enzyme of the starch
biosynthetic pathway into a host plant,

wherein the enzyme is selected from the group
consisting of starch branching enzyme I, starch branching
enzyme II, starch soluble synthase I, and debranching
enzyme, with the proviso that the enzyme is not soluble
starch synthase I of rice, or starch branching enzyme I of
rice or maize, and wherein if both steps (a) and (b) are
used, the enzymes in the two steps are different.

37. A method according to claim 36, wherein the plant
is a cereal plant.

38. A method according to claim 37, wherein the cereal
plant is wheat or barley.



WO 99/14314    PCT/AU98/00743

39. A method of targeting expression of a gene to the 
endosperm of a cereal plant, comprising the step of 
transforming the plant with a construct according to any one 
of claims 21 to 35. 

40. A method of modulating the time of expression of a 
gene in endosperm of a cereal plant, comprising the step of 
transforming the plant with a construct according to any one 
of claims 21 to 35. 

41. A method according to claim 40, wherein when
expression at an early stage following anthesis is desired,
the construct comprises either the SBE II, SSS I, or DBE
promoter.

42. A method according to claim 40, wherein when
expression at a later stage following anthesis is desired,
the construct comprises the SBE I promoter.

43. A plant transformed with a construct according to
any one of claims 21 to 35.

44. A plant according to claim 43, wherein the plant
is a cereal plant.

45. A plant according to claim 44, wherein the cereal
plant is wheat or barley.

46. A method of identifying variations in the starch
synthesis characteristics of a cereal plant, comprising the
step of identifying a variation in nucleic acid sequence in
the intron regions of the SBE I, SBE II, SSS I or DBE genes.

47. A method of identifying variations in the starch
synthesis characteristics of a cereal plant, comprising the
step of identifying a variation in nucleic acid sequence
compared to the sequence shown in one or more of SEQ ID NO: 5,
SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID
NO: 13, SEQ ID NO: 15, SEQ ID NO: 16, or SEQ ID NO: 17.

48. A method according to claim 47, in which a
mutation or absence of a SBE I, SBE II, SSS I or DBE gene is
detected.

49. A method according to either claim 47 or claim 48,
in which the cereal plant is wheat or barley.

50. A product comprising plant material propagated
from a plant transformed with a nucleic acid sequence
encoding an enzyme of the starch biosynthetic pathway in a
cereal plant, operably linked to one or more nucleic acid
sequences facilitating expression of the nucleic acid
sequence in a plant, wherein the enzyme is selected from the
group consisting of starch branching enzyme I, starch
branching enzyme II, starch soluble synthase I, and
debranching enzyme, with the proviso that the enzyme is not
soluble starch synthase I of rice, or starch branching
enzyme I of rice or maize, a biologically-active fragment
thereof.

51. A product comprising plant material propagated
from a plant in which a gene was targeted to the endosperm
of a cereal plant, by a nucleic acid construct comprising
one or more promoter sequences selected from the group
consisting of SBE I promoter, SBE II promoter, SSS I
promoter, and DBE promoter, operatively linked to a nucleic
acid sequence encoding a protein, wherein the expression of
the targeted gene in the endosperm of a cereal plant is
modified.

52. A product according to claim 50 or claim 51
wherein the product is a food product.
REGULATION OF GENE EXPRESSION IN PLANTS

This invention relates to methods of modulating
the expression of desired genes in plants, and to DNA
sequences and genetic constructs for use in these methods.
In particular, the invention relates to methods and
constructs for targeting of expression specifically to the
endosperm of the seeds of cereal plants such as wheat, and
for modulating the time of expression in the target tissue.
This is achieved by the use of promoter sequences from
enzymes of the starch biosynthetic pathway. In a preferred
embodiment of the invention, the sequences and/or promoters
are those of starch branching enzyme I, starch branching
enzyme II, soluble starch synthase I, and starch debranching
enzyme, all derived from Triticum tauschii, the D genome
donor of hexaploid bread wheat.

A further preferred embodiment relates to a method 
of identifying variations in the characteristics of plants. 

BACKGROUND OF THE INVENTION

Starch is an important constituent of cereal
grains and of flours, accounting for about 65-67% of the
weight of the grain at maturity. It is produced in the
amyloplast of the grain endosperm by the concerted action of
a number of enzymes, including ADP-Glucose pyrophosphorylase
(EC 2.7.7.27), starch synthases (EC 2.4.1.21), branching
enzymes (EC 2.4.1.18) and debranching enzymes (EC 3.2.1.41
and EC 3.2.1.68) (Ball et al, 1996; Martin and Smith, 1995;
Morell et al, 1995). Some of the proteins involved in the
synthesis of starch can be recovered from the starch
granule (Denyer et al, 1995; Rahman et al, 1995).

Most wheat cultivars normally produce starch
containing 25% amylose and 75% amylopectin. Amylose is
composed of large linear chains of α(1-4)-linked
α-D-glucopyranosyl residues, whereas amylopectin is a branched
form of α-glucan linked by α(1-6) linkages. The ratio of
amylose to amylopectin, the branch chain length and the
number of branch chains of amylopectin are the major factors
which determine the properties of wheat starch.

Starch with various properties has been widely
used in industry, food science and medical science. High
amylose wheat can be used for plastic substitutes and in
paper manufacture to protect the environment; in health
foods to reduce the risk of bowel cancer and heart disease; and in
sports foods to improve athletes' performance. High
amylopectin wheat may be suitable for Japanese noodles, and
is used as a thickener in the food industry.

Wheat contains three sets of chromosomes (A, B and
D) in its very large genome of about 10^10 base pairs (bp).
The donor of the D genome to wheat is Triticum tauschii, and
by using a suitable accession of this species the genes from
the D genome can be studied separately (Lagudah et al,
1991).

There is comparatively little variation in starch
structure found in wheat varieties, because the hexaploid
nature of wheat prevents mutations from being readily
identified. Dramatic alterations in starch structure are
expected to require the combination of homozygous recessive
alleles from each of the 3 wheat genomes, A, B and D. This
requirement renders the probability of finding such mutants
in natural or mutagenised populations of wheat very low.
Variation in wheat starch is desirable in order to enable
better tailoring of wheat starches for processing and
end-user requirements.
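
The scale of this problem can be illustrated with a short calculation. The sketch below is not taken from the specification: it assumes, purely for illustration, that null alleles at the three homoeologous loci occur independently, that the lines screened are fully inbred (homozygous), and that the per-genome null frequencies are hypothetical values chosen by the reader.

```python
from math import ceil, log, prod


def triple_null_probability(null_freqs):
    """Probability that one inbred line is null at every locus listed.

    Assumes independence between loci (an illustrative simplification).
    """
    for f in null_freqs:
        if not 0.0 <= f <= 1.0:
            raise ValueError("frequencies must lie between 0 and 1")
    return prod(null_freqs)


def lines_to_screen(p, confidence=0.95):
    """Number of lines to screen to see at least one hit with the given confidence."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # Solve 1 - (1 - p)**n >= confidence for the smallest integer n.
    return ceil(log(1.0 - confidence) / log(1.0 - p))


if __name__ == "__main__":
    # Hypothetical null frequencies for the A, B and D genome loci.
    p = triple_null_probability([0.01, 0.01, 0.01])
    print(f"triple-null probability: {p:.2e}")
    print(f"lines to screen (95%): {lines_to_screen(p)}")
```

Under these assumed frequencies the triple-null class is roughly one line in a million, which is why combining nulls identified separately in each genome is the more practical route.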

Key commercial targets for the manipulation of
starch biosynthesis are:

1. "Waxy" wheats in which amylose content is
decreased to insignificant levels. This outcome is expected
to be obtained by eliminating granule-bound starch synthase
activity.

2. High amylose wheats, expected to be obtained
by suppressing starch branching enzyme II activity.

3. Wheats which continue to synthesise starch
at elevated temperatures, expected to be obtained by
identifying or introducing a gene encoding a heat-stable
soluble starch synthase.

4. "Sugary types" of wheat which contain
increased amylose content and free sugars, expected to be
obtained by manipulating an isoamylase-type debranching
enzyme.

There are two general strategies which may be used
to obtain wheats with altered starch structure:

(a) using genetic engineering strategies to
suppress the activity of a specific gene, or to introduce a
novel gene into a wheat line; and

(b) selecting among existing variation in wheat for
missing ("null") or altered alleles of a gene in
each of the genomes of wheat, and combining
these by plant breeding.

However, in view of the complexity of the gene families,
particularly starch branching enzyme I (SBE I), without the
ability to target regions which are unique to genes
expressed in endosperm, modification of wheat by combination
of null alleles of several enzymes in general represents an
almost impossible task.

Branching enzymes are involved in the production
of glucose α-1,6 branches. Of the two main constituents of
starch, amylose is essentially linear, but amylopectin is
highly branched; thus branching enzymes are thought to be
directly involved in the synthesis of amylopectin but not
amylose. There are two types of branching enzymes in plants,
starch branching enzyme I (SBE I) and starch branching
enzyme II (SBE II), and both are about 85 kDa in size. At
the nucleic acid level there is about 65% sequence identity
between types I and II in the central portion of the
molecules; the sequence identity between SBE I from
different cereals is about 85% overall (Burton et al, 1995;
Morell et al, 1995).
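
Percentage identity figures of this kind are computed from an alignment of the two sequences. The following is a minimal sketch, not the method used in the cited studies: it assumes the sequences have already been aligned to equal length with "-" as the gap character, and it counts identity only over columns where neither sequence has a gap (other conventions for the denominator exist). The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns containing a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments, invented for illustration only.
    print(percent_identity("ACGT-ACGT", "ACGTTACGA"))
```

Restricting the comparison to a region, such as the central portion of the molecules mentioned above, amounts to slicing both aligned strings to the same column range before calling the function.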
In cereals, SBE I genes have so far been reported
only for rice (Kawasaki et al, 1991; Rahman et al, 1997). A
cDNA sequence for wheat SBE I is available on the GenBank
database (Accession No. Y12320; Repellin A., Nair R.B., Baga
M., and Chibbar R.N.: Plant Gene Register PGR97-094, 1997).
As far as we are aware, no promoter sequence for wheat SBE I
has been reported.
We have characterised an SBE I gene, designated
wSBE I-D2, from Triticum tauschii, the donor of the D genome
to wheat (Rahman et al, 1997). This gene encoded a protein
sequence which had a deletion of approximately 65 amino
acids at the C-terminal end, and appeared not to contain
some of the conserved amino acid motifs characteristic of
this class of enzyme (Svensson, 1994). Although wSBE I-D2
was expressed as mRNA, no corresponding protein has yet been
found in our analysis of SBE I isoforms from the endosperm,
and thus it is possible that this gene is a transcribed
pseudogene.

Genes for SBE II are less well characterised; no
genomic sequences are available, although SBE II cDNAs from
rice (Mizuno et al, 1993; Accession No. D16201) and maize
(Fisher et al, 1993; Accession No. L08065) have been
reported. In addition, a cDNA sequence for SBE II from
wheat is available on the GenBank database (Nair et al,
1997; Accession No. Y11282); although the sequences are very
similar to those reported herein, there are differences near
the N-terminal of the protein, which specifies its
intracellular location. No promoter sequences have been
reported, as far as we are aware.

Wheat granule-bound starch synthase (GBSS) is
responsible for amylose synthesis, while wheat branching
enzymes together with soluble starch synthases are
considered to be directly involved in amylopectin
biosynthesis. A number of isoforms of soluble and
granule-bound starch synthases have been identified in developing
wheat endosperm (Denyer et al, 1995). There are three
distinct isoforms of starch synthases, 60 kDa, 75-77 kDa and
100-105 kDa, which exist in the starch granules (Denyer et
al, 1995; Rahman et al, 1995). The 60 kDa GBSS is the
product of the wx gene. The 75-77 kDa protein is a wheat
soluble starch synthase I (SSS I) which is present in both
the soluble fraction and the starch granule-bound fraction
of the endosperm. However, the 100-105 kDa proteins, which
are another type of soluble starch synthase, are located
only in starch granules (Denyer et al, 1995; Rahman et al,
1995). To our knowledge there has been no report of any
complete wheat SSS I sequence, either at the protein or the
nucleotide level.

Both cDNA and genomic DNA encoding a soluble
starch synthase I of rice have been cloned and analysed
(Baba et al, 1993; Tanaka et al, 1995). The cDNAs encoding
potato soluble starch synthase SSSII and SSSIII and pea
soluble starch synthase SSSII have also been reported
(Edwards et al, 1995; Marshall et al, 1996; Dry et al,
1992). However, corresponding full length cDNA sequences for
wheat have hitherto not been available, although a partial
cDNA sequence (Accession No. U48227) has been released to
the GenBank database.

Approach (b) referred to above has been
demonstrated for the gene for granule-bound starch synthase.
Null alleles on chromosomes 7A, 7D and 4A were identified by
the analysis of GBSS protein bands by electrophoresis, and
combined by plant breeding to produce a wheat line
containing no GBSS, and no amylose (Nakamura et al, 1995).
Subsequently, PCR-based DNA markers have been identified,
which also identify null alleles for the GBSS loci on each
of the three wheat genomes. Despite the availability of a
considerable amount of information in the prior art, major
problems remain. Firstly, the presence of three separate
sets of chromosomes in wheat makes genetic analysis in this
species extraordinarily complex. This is further
complicated by the fact that a number of enzymes are
involved in starch synthesis, and each of these enzymes is
itself present in a number of forms, and in a number of
locations within the plant cell. Little, if any,
information has been available as to which specific form of
each enzyme is expressed in endosperm. For wheat, a limited
amount of nucleic acid sequence information is available,
but this is only cDNA sequence; no genomic sequence, and
consequently no information regarding promoters and other
control sequences, is available. Without being able to
demonstrate that the endosperm-specific gene within a family
has been isolated, such sequence information is of limited
practical usefulness.

SUMMARY OF THE INVENTION 

In this application we report the isolation and
identification of novel genes from T. tauschii, the D-genome
donor of wheat, that encode SBE I, SBE II, a 75 kDa SSS I,
and an isoamylase-type debranching enzyme (DBE). Because of
the very close relationship between T. tauschii and wheat,
as discussed above, results obtained with T. tauschii can be
directly applied to wheat with little if any modification.
Such modification as may be required represents routine
trial and error experimentation. Sequences from these genes
can be used as probes to identify null or altered alleles in
wheat, which can then be used in plant breeding programmes
to provide modifications of starch characteristics. The
novel sequences of the invention can be used in genetic
engineering strategies or to introduce a desired gene into a
host plant, to provide antisense sequences for suppression
of one or more specific genes in a host plant, in order to
modify the characteristics of starch produced by the plant.

By using T. tauschii, we have been able to examine
a single genome, rather than three as in wheat, and to
identify and isolate the forms of the starch synthesis genes
which are expressed in endosperm. By addressing genomic
sequences we have been able to isolate tissue-specific
promoters for the relevant genes, which provides a mechanism
for simultaneous manipulation of a number of genes in the
endosperm. Because T. tauschii is so closely related to
wheat, results obtained with this model system are directly
applicable to wheat, and we have confirmed this
experimentally. The genomic sequences which we have
determined can also be used as probes for the identification
and isolation of corresponding sequences, including promoter
sequences, from other cereal plant species.

In its most general aspect, the invention provides
a nucleic acid sequence encoding an enzyme of the starch
biosynthetic pathway in a cereal plant, said enzyme being
selected from the group consisting of starch branching
enzyme I, starch branching enzyme II, soluble starch
synthase I, and debranching enzyme, with the proviso that
the enzyme is not soluble starch synthase I of rice, or
starch branching enzyme I of rice or maize.

Preferably the nucleic acid sequence is a DNA
sequence, and may be genomic DNA or cDNA. Preferably the
sequence is one which is functional in wheat. More
preferably the sequence is derived from a Triticum species,
most preferably Triticum tauschii.

Where the sequence encodes soluble starch
synthase, preferably the sequence encodes the 75 kD soluble
starch synthase of wheat.

Biologically-active untranslated control sequences
of genomic DNA are also within the scope of the invention.
Thus the invention also provides the promoter of an enzyme
as defined above.

In a preferred embodiment of this aspect of the
invention, there is provided a nucleic acid construct
comprising a nucleic acid sequence of the invention, a
biologically-active fragment thereof, or a fragment thereof
encoding a biologically-active fragment of an enzyme as
defined above, operably linked to one or more nucleic acid
sequences facilitating expression of said enzyme in a plant,
preferably a cereal plant. The construct may be a plasmid
or a vector, preferably one suitable for use in the
transformation of a plant. A particularly suitable vector
is a bacterium of the genus Agrobacterium, preferably
Agrobacterium tumefaciens. Methods of transforming cereal
plants using Agrobacterium tumefaciens are known; see for
example Australian Patent No. 667939 by Japan Tobacco Inc.,
International Patent Application Number PCT/US97/10621 by
Monsanto Company and Tingay et al (1997).

In a second aspect, the invention provides a
nucleic acid construct for targeting of a desired gene to
endosperm of a cereal plant, and/or for modulating the time
of expression of a desired gene in endosperm of a cereal
plant, comprising one or more promoter sequences selected
from SBE I promoter, SBE II promoter, SSS I promoter, and
DBE promoter, operatively linked to a nucleic acid sequence
encoding a desired protein, and optionally also operatively
linked to one or more additional targeting sequences and/or
one or more 3' untranslated sequences.

The nucleic acid encoding the desired protein may
be in either the sense orientation or in the antisense
orientation. Preferably the desired protein is an enzyme of
the starch biosynthetic pathway. For example, the antisense
sequences of GBSS, starch debranching enzyme, SBE II, low
molecular weight glutenin, or grain softness protein I, may
be used. Preferred sequences for use in sense orientation
include those of bacterial isoamylase, bacterial glycogen
synthase, or wheat high molecular weight glutenin Bx17. It
is contemplated that any desired protein which is encoded by
a gene which is capable of being expressed in the endosperm
of a cereal plant is suitable for use in the invention.
In a third aspect, the invention provides a method
of modifying the characteristics of starch produced by a
plant, comprising the step of:

(a) introducing a gene encoding a desired enzyme
of the starch biosynthetic pathway into a host plant, and/or

(b) introducing an anti-sense nucleic acid
sequence directed to a gene encoding an enzyme of the starch
biosynthetic pathway into a host plant,

wherein said enzymes are as defined above.

Where both steps (a) and (b) are used, the enzymes
in the two steps are different.

Preferably the plant is a cereal plant, more
preferably wheat or barley.

As is well known in the art, anti-sense sequences
can be used to suppress expression of the protein to which
the anti-sense sequence is complementary. It will be
evident to the person skilled in the art that different
combinations of sense and anti-sense sequences may be chosen
so as to effect a variety of different modifications of the
characteristics of the starch produced by the plant.
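
By way of illustration only, the relationship between a sense
sequence and its anti-sense counterpart can be sketched as a
reverse-complement operation. The function name and the short
fragment below are hypothetical and are not taken from any
sequence of the invention.

```python
# Minimal sketch: an anti-sense sequence is the reverse complement
# of the sense strand (both written 5' to 3').
# The example fragment is made up for illustration.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense(sense: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return sense.translate(COMPLEMENT)[::-1]


sense_fragment = "ATGGCGTTC"
print(antisense(sense_fragment))  # GAACGCCAT
```

Applying the function twice returns the original sense sequence,
which is a convenient check on any anti-sense design.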

In a fourth aspect, the invention provides a 
method of targeting expression of a desired gene to the 
endosperm of a cereal plant, comprising the step of 
transforming the plant with a construct according to the 
invention. 

According to a fifth aspect, the invention
provides a method of modulating the time of expression of a
desired gene in endosperm of a cereal plant, comprising the
step of transforming the plant with a construct according to
the second aspect of the invention.

Where expression at an early stage following 
anthesis is desired, the construct preferably comprises the 
SBE II, SSS I or DBE promoters. Where expression at a later 
stage following anthesis is desired, the construct 
preferably comprises the SBE I promoter. 

While the invention is described in detail in
relation to wheat, it will be clearly understood that it is
also applicable to other cereal plants of the family
Gramineae, such as maize, barley and rice.

Methods for transformation of monocotyledonous
plants such as wheat, maize, barley and rice and for
regeneration of plants from protoplasts or immature plant
embryos are well known in the art. See for example Lazzeri
et al, 1991; Jahne et al, 1991 and Wan and Lemaux, 1994 for
barley; Wirtzens et al, 1997; Tingay et al, 1991; Canadian
Patent Application No. 2092588 by Nehra; Australian Patent
Application No. 61781/94 by National Research Council of
Canada, Australian Patent No. 667939 by Japan Tobacco Co,
and International Patent Application Number PCT/US97/10621
by Monsanto Company.

The sequences of ADP glucose pyrophosphorylase
from barley (Australian Patent Application No. 65392/94),
starch debranching enzyme and its promoter from rice
(Japanese Patent Publication No. Kokai 6261787 and Japanese
Patent Publication No. Kokai 5317057), and starch
debranching enzyme from spinach and potato (Australian
Patent Application No. 44333/96) are all known.

Detailed Description of the Drawings

The invention will be described in detail by
reference only to the following non-limiting examples and to
the figures.

Figure 1 shows the hybridisation of genomic clones
isolated from T. tauschii.

DNA was extracted from the different clones,
digested with BamHI and hybridised with the 5' end of the
maize SBE I cDNA. Lanes 1, 2, 3 and 4 correspond to DNA
from clones λE1, λE2, λE6 and λE7 respectively. Note that
clones λE1 and λE2 give identical patterns, the SBE I gene
in λE6 is a truncated form of that in λE1, and λE7 gives a
clearly different pattern.

Figure 2 shows the hybridisation of DNA from
T. tauschii.

DNA from T. tauschii was digested with BamHI and
the hybridisation pattern compared with DNA from λE1 and λE7
digested with the same enzyme. Fragment E1.1 (see Figure 3)
from λE1 was used as the probe; it contains some sequences
that are over 80% identical to sequences in E7.8.
Approximately 25 µg of T. tauschii DNA was electrophoresed
in lane 1, and 200 pg each of λE1 and λE7 in lanes 2 and 3,
respectively.

Figure 3 shows the restriction maps of clone λE1
and λE7. The fragments obtained with EcoRI and BamHI are
indicated. The fragments sequenced from λE1 are E1.1, E1.2,
a part of E1.7 and a part of E1.5.

Figure 4 shows the comparison of deduced amino
acid sequence of wSBE I-D4 cDNA with the deduced amino acid
sequence of rice SBE I (RSBE I; Nakamura et al, 1992), maize
SBE I (MSBE I; Baba et al, 1991), wSBE I-D2 type cDNA (D2
cDNA; Rahman et al, 1997), pea SBE II (PESBE II, homologous
to maize SBE I; Burton et al, 1995), and potato SBE I
(POSBE; Cangiano et al, 1993). The deduced amino acid
sequence of the wSBE I-D4 cDNA is denoted by "D4cDNA".
Residues present in at least three of the sequences are
identified in the consensus sequence in capitals.

Figure 5 shows the intron-exon structure of
wSBE I-D4 compared to the corresponding structures of rice
SBE I (Kawasaki et al, 1993) and wSBE I-D2 (Rahman et al,
1997). The intron-exon structure of wSBE I-D4 is deduced by
comparison with the SBE I cDNA reported by Repellin et al
(1997).

The dark rectangles correspond to exons and the
light rectangles correspond to introns. The bars above the
structures indicate the percentage identity in sequence
between the indicated exons and introns of the relevant
genes. Note that intron 2 shares no significant sequence
identity and is not indicated.

Figure 6 shows the nucleotide sequence of part of
wSBE I-D4, the amino acid sequence deduced from this
nucleotide sequence, and the N-terminal amino acid sequence
of the SBE I purified from the wheat endosperm (Morell et
al, 1997).

Figure 7 shows the hybridisation of SBE I genomic
clones with the following probes,

A. wSBE I-D45 (derived from the 5' end of the
gene and including sequence from fragments E1.1 and E1.7),
and

B. wSBE I-D43 (derived from the 3' end of the
gene and containing sequences from fragment E1.5). For
panel A, the tracks 1-13 correspond to clones λE1, λE2, λE6,
λE7, λE9, λE14, λE22, λE27, molecular weight markers, λE29,
λE30, λE31 and λE52. For panel B, tracks 1-12 correspond to
clones λE1, λE2, λE6, λE7, λE9, λE14, λE22, λE27, λE29,
λE30, λE31 and λE52. Note that clones λE7 and λE22 do not
hybridise to either of the probes and are wSBE I-D2 type
genes. Also note that clone λE30 contains a sequence
unrelated to SBE I. The size of the molecular weight
markers in kb is indicated. Clones λE7 and λE22 do
hybridise with a probe from E1.1, which is highly conserved
between wSBE I-D2 and wSBE I-D4.

Figure 8 shows the alignment of cDNA clones to
obtain the sequence represented by wSBE I-D4 cDNA. BED4 and
BED5 were obtained from screening the cDNA library with
maize BEI (Baba et al, 1991). BED1, 2 and 3 were obtained
by RT-PCR using defined primers.

Figure 9a shows the expression of Soluble Starch
Synthase I (SSS), Starch Branching Enzyme I (BE I) and
Starch Branching Enzyme II (BE II) mRNAs during endosperm
development.

RNA was purified from leaves, florets prior to
anthesis, and endosperm of wheat cultivar Rosella grown in a
glasshouse, collected 5 to 8 days after anthesis, 10 to 15
days after anthesis and 18 to 22 days after anthesis, and
from the endosperm of wheat cultivar Rosella grown in the
field and collected 12, 15 and 18 days after anthesis
respectively. Equivalent amounts of RNA were
electrophoresed in each lane. The probes were from the
coding region of the SM2 SSS I cDNA (from nucleotide 1615 to
1919 of the SM2 cDNA sequence); wSBE I-D43C (see Table 1),
which corresponds to the untranslated 3' end of wSBE I-D4
cDNA (E1(3')); and the 5' region of SBE9 (SBE9(5')),
corresponding to the region between nucleotides 743 to 1004
of GenBank sequence Y11282. No hybridisation to RNA
extracted from leaves or preanthesis florets was detected.

Figure 9b shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
the starch branching enzyme I gene. The probe, wSBE I-D43, is
defined in Table 1.

Figure 9c shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Wyuna" with
the starch branching enzyme II gene. The probe, wSBE II-D13,
is defined in Table 2.

Figure 9d shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
the SSS I gene. The probe spanned the region from
nucleotides 2025 to 2497 of the SM2 cDNA sequence shown in
SEQ ID No: 11.

Figure 9e shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
the DBE I gene. The probe, a DBE 3' PCR fragment, extends
from nucleotide position 281 to 1072 of the cDNA sequence in
SEQ ID No: 16.

Figure 9f shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
the wheat actin gene. The probe was a wheat actin DNA
sequence generated by PCR from wheat endosperm cDNA using
primers to conserved plant actin sequences.

Figure 9g shows the hybridisation of RNA from the
endosperm of the hexaploid T. aestivum cultivar "Gabo" with
a probe containing wheat ribosomal RNA 26S and 18S fragments
(plasmid pta250.2 from Dr Bryan Clarke, CSIRO Plant
Industry).

Figure 9h shows the hybridisation of RNA from the
hexaploid wheat cultivar "Gabo" with the DBE I probe
described in Figure 9e. Lane 1, leaf RNA; lane 2,
pre-anthesis floret RNA; lane 3, RNA from endosperm harvested 12
days after anthesis.

Figure 10 shows the comparison of wSBE I-D4
(sr427.res ck: 6,362, 1 to 11,099) and rice SBE I genomic
sequence (d10838.em_pl ck: 3,071, 1 to 11,700) (Kawasaki et
al, 1993; Accession Number D10838) using the programs
Compare and DotPlot (Devereaux et al, 1984). The programs
used a window of 21 bases with a stringency of 14 to
register a dot.
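
The window and stringency criterion can be illustrated with a
minimal sketch. It assumes a simple identity count, in which a dot
is registered when at least 14 of the 21 aligned bases in a window
are identical; the scoring actually used by the Compare program may
differ in detail, and the function name and toy sequences here are
hypothetical.

```python
from typing import List, Tuple


def dot_plot(seq_a: str, seq_b: str,
             window: int = 21, stringency: int = 14) -> List[Tuple[int, int]]:
    """Return (i, j) window start positions that register a dot.

    A dot is registered when at least `stringency` of the `window`
    aligned bases are identical (simplified identity scoring).
    """
    dots = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(a == b for a, b in
                          zip(seq_a[i:i + window], seq_b[j:j + window]))
            if matches >= stringency:
                dots.append((i, j))
    return dots


# Toy usage with made-up sequences and a smaller window.
a = "ACGTACGTAC"
b = "ACGTTCGTAC"
print(dot_plot(a, b, window=5, stringency=4))
```

This brute-force comparison is quadratic in sequence length, which
is acceptable for a sketch but slow for sequences of around 11 kb
such as those compared in Figure 10.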

Figure 11 shows the hybridisation of wheat DNA
from chromosome-engineered lines using the following probes:

A. wSBE I-D45 (from the 5' end of the gene),

B. wSBE I-D43 (from the 3' end of the gene),

C. wSBE I-D4R (repetitive sequence
approximately 600 bp 3' to the end of wSBE I-D4 sequence).

N7AT7B, no 7A chromosome, four copies of 7B
chromosome; N7BT7D, no 7B chromosome, four copies of 7D
chromosome; N7DT7A, no 7D chromosome, four copies of 7A
chromosome. The chromosomal origin of hybridising bands is
indicated.

Figure 12 shows the hybridisation of genomic
clones F1, F2, F3 and F4 with the entire SBE-9 sequence.
The DNA from the clones was purified and digested with
either BamHI or EcoRI, separated on agarose, blotted onto
nitrocellulose and hybridised with labelled SBE-9 (a SBE II
type cDNA). The pattern of hybridising bands is different
in the four isolates.

Figure 13a shows the N-terminal sequence of
purified SBE II from wheat endosperm as in Morell et al
(1997).

Figure 13b shows the deduced amino acid sequence
from part of wSBE II-D1 that encodes the N-terminal sequence
as described in Morell et al (1997).

Figure 14 shows the deduced exon-intron structure
for a part of wSBE II-D1. The scale is marked in bases.
The dark rectangles are exons.

Figure 15 shows the hybridisation of DNA from
chromosome engineered lines of wheat (cultivar Chinese
Spring) with a probe from nucleotides 550-850 from SBE-9.
The band of approximately 2.2 kb is missing in the line in
which chromosome 2D is absent.

T2BN2A: four copies of chromosome 2B, no copies
of chromosome 2A;

T2AN2B: four copies of chromosome 2A, no copies
of chromosome 2B;

T2AN2D: four copies of chromosome 2A, no copies
of chromosome 2D.

Figure 16 shows the N-terminal sequence of SSS I
protein isolated from starch granules (Rahman et al, 1995)
and deduced amino acid sequence of part of Sm2.
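
The reasoning used to assign a hybridising band to a chromosome
with the chromosome engineered lines of Figure 15 can be sketched
as follows. The line names and their missing chromosomes are taken
from the Figure 15 legend; the function name is hypothetical, and
the scoring of the 2.2 kb band as present in the two lines that
retain chromosome 2D is an assumption made for illustration.

```python
# Chromosome absent from each line, as given in the Figure 15 legend.
NULLISOMIC = {"T2BN2A": "2A", "T2AN2B": "2B", "T2AN2D": "2D"}


def locate_band(band_present: dict) -> list:
    """Return chromosomes whose absence coincides with loss of the band."""
    return sorted(NULLISOMIC[line]
                  for line, present in band_present.items()
                  if not present)


# Assumed scoring of the ~2.2 kb band: lost only where 2D is absent.
observed = {"T2BN2A": True, "T2AN2B": True, "T2AN2D": False}
print(locate_band(observed))  # ['2D']
```

A band that is lost in exactly one line is assigned to the
chromosome missing from that line; a band present in every line
cannot be assigned by this test alone.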

Figure 17 shows the hybridisation of genomic
clones sg1, 3, 4, 6 and 11 with the cDNA clone (sm2) for SSS
I. DNA was purified from indicated genomic clones, digested
with BamHI or SacI and hybridised to sm2. Note that the
hybridisation patterns for sg1, 3 and 4 are clearly
different from each other.

Figure 18 shows a comparison of the intron/exon
structures of the wheat and rice soluble starch synthase
genomic sequences. The dark rectangles indicate exons and
the light rectangles represent introns.

Figure 19 shows the hybridisation of DNA from
chromosome engineered lines of wheat (cultivar Chinese
Spring) digested with PvuII, with the sm2 probe.

N7AT7B: no 7A chromosome, four copies of 7B
chromosome;

N7BT7D: no 7B chromosome, four copies of 7D
chromosome;

N7DT7A: no 7D chromosome, four copies of 7A
chromosome.

A band is missing in the N7BT7A line.

Figure 20a shows the DNA sequence of a portion of
the wheat debranching enzyme (WDBE-I) PCR product. The
PCR product was generated from wheat genomic DNA (cultivar
Rosella) using primers based on sequences conserved in
debranching enzymes from maize and rice.

Figure 20b shows a comparison of the nucleotide
sequence of wheat debranching enzyme I (WDBE-I) PCR fragment
(WHEAT.DNA) with the maize Sugary-1 sequence (SUGARY.DNA).

Figure 20c shows a comparison between the
intron/exon structures of wheat debranching enzyme gene and
the maize sugary-1 debranching enzyme gene.
Figure 21a shows the results of Southern blotting
of T. tauschii DNA with wheat DBE-I PCR product. DNA from
T. tauschii was digested with BamHI, electrophoresed,
blotted and hybridised to the wheat DBE-I PCR product
described in Figure 20a. A band of approximately 2 kb
hybridised.

Figure 21b shows Chinese Spring nullisomic/
tetrasomic lines probed with probes from the DBE gene. Panel
(I) shows hybridisation with a fragment spanning the region
from nucleotide 270 to 465 of the cDNA sequence shown in SEQ
ID No: 16 from the central region of the DBE gene. Panel
(II) shows hybridisation with a probe from the 3' region of
the gene, from nucleotide 281 to 1072 of the cDNA sequence
given in SEQ ID No: 16.

Figures 22a to 22e show diagrammatic
representations of the DNA vectors used for transient
expression analysis. In each of the sequences the N-terminal
methionine encoding ATG codon is shown in bold.

Figure 22a shows a DNA construct pwsssIpro1gfpNOT
containing a 1042 base pair region of the wheat soluble
starch synthase I promoter (wSSSIpro1, from -1042 to -1, SEQ
ID No: 18) fused to the green fluorescent protein (GFP)
reporter gene.

Figure 22b shows a DNA construct pwsssIpro2gfpNOT
containing a 3914 base pair region of the wheat soluble
starch synthase I promoter (wSSSIpro2, from -3914 to -1, SEQ
ID No: 18) fused to the green fluorescent protein (GFP)
reporter gene.

Figure 22c shows a DNA construct psbeIIpro1gfpNOT
containing a 1203 base pair region of the wheat starch
branching enzyme II promoter (sbeIIpro1, from 1 to 1023 of SEQ
ID No: 10) fused to the green fluorescent protein (GFP)
reporter gene.

Figure 22d shows a DNA construct psbeIIpro2gfpNOT
containing a 1353 base pair region of the wheat starch
branching enzyme II promoter and transit peptide coding
region (sbeIIpro2; regions 1-1203, 1204 to 1336 and 1664 to
1680 of SEQ ID No: 10) fused to the green fluorescent protein
(GFP) reporter gene.
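
As a simple check on the stated insert sizes, the length of each
region can be computed from its inclusive coordinates. The helper
below is a hypothetical illustration; the coordinates are those
given for Figures 22a, 22b and 22d.

```python
def region_length(start: int, end: int) -> int:
    """Length in bp of an inclusive coordinate range.

    Works for sequence numbering (1..N) and for promoter
    numbering relative to the ATG (-N..-1).
    """
    return abs(end - start) + 1


wsssI_pro1 = region_length(-1042, -1)
wsssI_pro2 = region_length(-3914, -1)
sbeII_pro2 = sum(region_length(s, e)
                 for s, e in [(1, 1203), (1204, 1336), (1664, 1680)])

print(wsssI_pro1, wsssI_pro2, sbeII_pro2)  # 1042 3914 1353
```

The three regions listed for sbeIIpro2 (1203, 133 and 17 base
pairs) sum to the 1353 base pairs stated for that construct.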

Figure 22e shows a DNA construct pact_jsgfp_nos
containing the plasmid backbone of pSP72 (Promega), the rice
ActI actin promoter (McElroy et al, 1991), the GFP gene
(Sheen et al, 1995) and the Agrobacterium tumefaciens
nopaline synthase (nos) terminator (Bevan et al, 1983).
Figure 23 shows T DNA constructs for stable
transformation of rice by Agrobacterium. The backbone for
each plasmid is p35SH-iC (Wang et al, 1997). The various
promoter-GFP-Nos regions inserted are shown in (a), (b), (c)
and (d) respectively, and are described in detail in Example
24. Each of these constructs was inserted into the NotI
site of p35SH-iC using the NotI flanking sites at each end
of the promoter-GFP-Nos regions. The constructs were named
(a) p35SH-iC-BEIIpro1_GFP_Nos, (b) p35SH-iC-BEIIpro2_GFP_Nos,
(c) p35SH-iC-SSIpro1_GFP_Nos and (d) p35SH-iC-SSIpro2_GFP_Nos.

Figure 24 illustrates the design of 15
intron-spanning BE II primer sets. Primers were based on
wSBE II-D1 sequence (SEQ ID No: 10), and were designed such
that intron sequences in the wSBE II-D1 sequence (deduced
from Figure 13b and Nair et al, 1997; Accession No. Y11282)
were amplified by PCR.

Figure 25 shows the results of amplification using
the SBE II-Intron 5 primer set (primer set 6: sr913F and
WBE2E6R) on various diploid, tetraploid and hexaploid
wheats.

i) T. boeoticum (A genome diploid)

ii) T. tauschii (D genome diploid)

iii) T. aestivum cv. Chinese Spring ditelosomic line
2AS (lacking chromosome arm 2AL)

iv) Crete 10 (AABB tetraploid)

v) T. aestivum cv. Rosella (hexaploid)

The horizontal axis indicates the size of the
product in base pairs, the vertical axis shows arbitrary
fluorescence units. The various arrows indicate the products
of different genomes: A, A genome, B, B genome, D, D genome,
U, unassigned additional product.

Figure 26 shows the results obtained by
amplification using the SBE II-Intron 10 primer set (primer
set 11: daS.seq and WBE2E11R) on the wheat lines:

(i) T. aestivum cv. Chinese Spring ditelosomic line
2AS.

(ii) T. aestivum Chinese Spring
nullisomic/tetrasomic line N2BT2A.

(iii) T. aestivum Chinese Spring
nullisomic/tetrasomic line N2DT2B.

The horizontal axis indicates the size of the
product in base pairs, the vertical axis shows arbitrary
fluorescence units. The various arrows indicate the products
of different genomes: A, A genome, B, B genome, D, D genome.
Figure 27 shows the results of transient
expression assays typical of each promoter and target
tissue. The photographs (40 x magnification) show
representative tissue from the transient expression assays
for each promoter and target tissue, viewed under a Leica
microscope with blue light illumination. Photographs were
taken 48 to 72 hours after tissue bombardment. The promoter
constructs are listed as follows (with the panels showing
endosperm, embryo and leaf expression listed in respective
order): pact_jsgfp_nos (panels a, g and m); pwsssIpro1gfpNOT
(panels b, h and n); pwsssIpro2gfpNOT (panels c, i and o);
psbeIIpro1gfpNOT (panels d, j and p); psbeIIpro2gfpNOT
(panels e, k and q); pZLgfpNOT (panels f, l and r).



Example 1 Identification of Gene Encoding SBE I

Construction of Genomic Library and Isolation of Clones

The genomic library used in this study was
constructed from Triticum tauschii, var. strangulata,
accession number CPI 100799. Of all the accessions of
T. tauschii surveyed, the genome of CPI 100799 is the most
closely related to the D genome of hexaploid wheat.

Triticum tauschii, var. strangulata (CPI accession
number 110799) was kindly provided by Dr E. Lagudah. Leaves
were isolated from plants grown in the glasshouse.

DNA was extracted from leaves of Triticum tauschii
using published methods (Lagudah et al, 1991), partially
digested with Sau3A, size fractionated and ligated to the
arms of lambda GEM 12 (Promega). The ligated products were
used to transfect the methylation-tolerant strain PMC 103
(Doherty et al, 1992). A total of 2 x 10⁶ primary plaques
were obtained, with an average insert size of about 15 kb.
Thus the library contains approximately 6 genomes worth of
T. tauschii DNA. The library was amplified and stored at
4°C until required.
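
The estimate of library coverage can be reproduced with a short
calculation. This is a sketch under stated assumptions: the plaque
count is taken as 2 x 10⁶ (the exponent is not legible in the text
as extracted), the average insert size as the 15 kb given in
Example 2, and the haploid genome size of T. tauschii as
5 x 10⁹ bp, a value chosen here only because it agrees with the
"approximately 6 genomes" stated above. The function names are
hypothetical.

```python
def genome_equivalents(n_clones: int, insert_bp: int, genome_bp: float) -> float:
    """Number of genome equivalents represented in the library."""
    return n_clones * insert_bp / genome_bp


def prob_sequence_present(n_clones: int, insert_bp: int, genome_bp: float) -> float:
    """Probability that a given sequence is in the library.

    Uses the Clarke and Carbon relation P = 1 - (1 - f)^N,
    where f is the insert size as a fraction of the genome.
    """
    f = insert_bp / genome_bp
    return 1 - (1 - f) ** n_clones


N_PLAQUES = 2_000_000   # assumed: 2 x 10^6 primary plaques
INSERT_BP = 15_000      # 15 kb average insert (Example 2)
GENOME_BP = 5e9         # assumed haploid genome size

print(genome_equivalents(N_PLAQUES, INSERT_BP, GENOME_BP))  # 6.0
print(round(prob_sequence_present(N_PLAQUES, INSERT_BP, GENOME_BP), 4))
```

Under these assumptions any single-copy sequence would be expected
to be present in the primary library with a probability above 99%.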

Positive plaques in the genomic library were
selected as those hybridising with the 5' end of a maize
starch branching enzyme I cDNA (Baba et al, 1991), using
the stringency conditions described in Rahman et al (1997).

Preparation of Total RNA from Wheat

Total RNA was isolated from leaves, pre-anthesis
pericarp and different developmental stages of wheat
endosperm of the cultivars Hartog and Rosella. This
material was collected from both the glasshouse and the
field. The method used for RNA isolation was essentially
the same as that described by Higgins et al (1976). RNA was
then quantified by UV absorption and by separation in
1.4% agarose-formaldehyde gels which were then visualized
under UV light after staining with ethidium bromide
(Sambrook et al, 1989).

DNA and RNA analysis

DNA was isolated and analysed using established
protocols (Sambrook et al, 1989). DNA was extracted from
wheat (cv. Chinese Spring) using published methods (Lagudah
et al, 1991). Southern analysis was performed essentially
as described by Jolly et al (1996). Briefly, 20 µg wheat
DNA was digested, electrophoresed and transferred to a nylon
membrane. Hybridisation was conducted at 42°C in 25% or
50% formamide, 2 x SSC, 6% Dextran Sulphate for 16 h and the
membrane was washed at 60°C in 2 x SSC for 3 x 1 h unless
otherwise indicated. Hybridisation was detected by
autoradiography using Fuji X-Omat film.
RNA analysis was performed as follows. 10 µg of
total RNA was separated in a 1.4% agarose-formaldehyde gel
and transferred to a nylon Hybond N+ membrane (Sambrook et
al, 1989), and hybridized with cDNA probe at 42°C in
Khandjian hybridizing buffer (Khandjian, 1989). The 3' part
of wheat SBE I cDNA (designated wSBE I-D43, see Table 1) was
labelled with the Rapid Multiprime DNA Probe Labelling Kit
(Amersham) and used as probe. After washing at 60°C with
2 x SSC, 0.1% SDS three times, each time for about 1 to
2 hours, the membrane was visualized by overnight exposure
at -80°C with X-ray film, Kodak MR.

Example 2 Frequency of Recovery of SBE I Type Clones
from the Genomic Library

An estimated 2 x 10⁶ plaques from the amplified
library were screened using an EcoRI fragment that contained
1200 bp at the 5' end of maize SBE I (Baba et al, 1991) and
twelve independent isolates were recovered and purified.
This corresponds to the screening of somewhat fewer than the
2 x 10⁶ primary plaques that exist in the original library
(each of which has an average insert size of 15 kb)
(Maniatis et al, 1982), because the amplification may lead
to the representation of some sequences more than others.
Assuming that the amplified library contains approximately
three genomes of T. tauschii, the frequency with which
SBE I-positive clones were recovered suggests the existence
of about 5 copies of SBE I type genes within the T. tauschii
genome.

Digestion of DNA from the twelve independent
isolates by the restriction endonuclease BamHI followed by
hybridisation with a maize SBE I clone, suggested that the
genomic clones could be separated into two broad classes
(Figure 1). One class had 10 members and a representative
from this class is the clone λE1 (Figure 1, lane 1); λE6
(Figure 1, lane 3) is a member of this class, but is missing
the 5' end of the E1-SBE I gene because the SBE I gene is at
the extremity of the cloned DNA. Further hybridisation
studies at high stringency with the extreme 5' and
3' regions of the SBE I gene contained in λE1 suggested that
the other clones contained either identical or very closely
related genes.

The second family had two members, and of these
clone λE7 (Figure 1, lane 4) was arbitrarily selected for
further study. These two members did not hybridise to
probes from the extreme 5' and 3' regions of the SBE I gene
that were contained in λE1, indicating that they were a
distinct sub-class.

The DNA from T. tauschii and the lambda clones λE1
and λE7 was digested with BamHI and hybridised with
fragment E1.1, as shown in Figure 2. This fragment contains
sequences that are highly conserved (85% sequence identity
over 0.3 kb between λE1 and λE7), corresponding to exons 3,
4 and 5 of the rice gene. The bands in the genomic DNA at
0.8 kb and 1.0 kb correspond to identical sized fragments
from λE1 and λE7, as shown in Figure 2; these are
fragments E1.1 and E7.8 of the λE1 and λE7 genomic clones
respectively. Thus the arrangement of genes in the genomic
clones is unlikely to be an artefact of the cloning
procedure. There are also bands in the genomic DNA of
approximately 2.5 kb, 4.8 kb and 8 kb in size which are not
found from the digestion of λE1 or λE7; these could
represent genes such as the 5' sequences of wSBE I-D1 or
wSBE I-D3; see below.

Example 3 Tandem Arrangement of SBE I Type Genes in
the T. tauschii Genome

Basic restriction endonuclease maps for λE1 and
λE7 are shown in Figure 3. The maps were constructed by
performing a series of hybridisations of EcoRI or BamHI
digested DNA from λE1 or λE7. The probes used were the
fragments generated from BamHI digestion of the relevant
clone. Confirmation of the maps was obtained by PCR
analysis, using primers both within the insert and also from
the arms of lambda itself. PCR was performed in 10 µl
volume using reagents supplied by Perkin-Elmer. The primers
were used at a concentration of 20 µM. The program used was
94°C, 2 min, 1 cycle; then 94°C, 30 sec; 55°C, 30 sec; 72°C,
1 min for 36 cycles; and then 72°C, 5 min; 25°C, 1 min.
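
As an illustration of the cycling program just described, the sketch below encodes the steps as data and sums the programmed hold times. It is not part of the original protocol: the class and function names are hypothetical, and ramp times between temperatures are ignored, so the real run time on any instrument would be longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int   # block temperature in degrees Celsius
    seconds: int  # programmed hold time in seconds


# Cycling program as stated in the text.
INITIAL = [Step(94, 120)]                            # 94 C, 2 min, 1 cycle
CYCLE = [Step(94, 30), Step(55, 30), Step(72, 60)]   # repeated block
N_CYCLES = 36
FINAL = [Step(72, 300), Step(25, 60)]                # final extension, then hold


def total_hold_seconds(initial, cycle, n_cycles, final):
    """Sum programmed hold times; ramping between temperatures is ignored."""
    per_cycle = sum(step.seconds for step in cycle)
    return (sum(step.seconds for step in initial)
            + n_cycles * per_cycle
            + sum(step.seconds for step in final))


total = total_hold_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"{total / 60:.0f} min of programmed hold time")
```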

Sequencing was performed on an ABI sequencer using
the manufacturer's recommended protocols for both dye primer
and dye terminator technologies. Deletions were carried out
using the Erase-a-base kit from Promega.

Sequence analysis was carried out using the GCG
version 7 package of computer programs (Devereaux et al,
1984).

The PCR products were also used as hybridisation
probes. The positioning of the genes was derived from
sequencing the ends of the BamHI subclones and also from
sequencing PCR products generated from primers based on the
insert and the lambda arms. The results indicate that there
is only a single copy of a SBE I type gene within λE1.
However, it is clear that λE7 resulted from the cloning of a
DNA fragment from within a tandem array of the SBE I type
genes. Of the three genes in the clone, which are named
wSBE I-D1, wSBE I-D2 and wSBE I-D3, only the central one
(wSBE I-D2) is complete.

Example 4 Construction and Screening of cDNA Library

A wheat cDNA library was constructed from the
cultivar Rosella using pooled RNA from endosperm at 8, 12,
18 and 20 days after anthesis.

The cDNA library was prepared from poly A+ RNA
that was extracted from developing wheat grains (cv.
Rosella, a hexaploid soft wheat cultivar) at 8, 12, 15, 18,
21 and 30 days after anthesis. The RNA was pooled and used
to synthesise cDNA that was propagated in lambda ZapII
(Stratagene).

The library was screened with a genomic fragment
from λE7 encompassing exons 3, 4 and 5 (fragment E7.8 in
Figure 3). A number of clones were isolated. Of these, an
apparently full-length clone appeared to encode an unusual
type of cDNA for SBE I. This cDNA has been termed SBE I-D2
type cDNA. The putative protein product is compared with
the maize SBE I and rice SBE I type deduced amino acid
sequences in Figure 4. The main difference is that this
putative protein product is shorter at the C-terminal end,
with an estimated molecular size of approximately 74 kDa
compared with 85 kDa for rice SBE I (Kawasaki et al, 1993).
Note that amino acids corresponding to exon 9 of rice are
missing in SBE I-D2 type cDNA, but those corresponding to
exon 10 are present. There are no amino acid residues
corresponding to exons 11-14 of rice; furthermore, the
sequence corresponding to the last 57 amino acids of
SBE I-D2 type has no significant homology to the sequence of
the rice gene.

We expressed SBE I-D2 type cDNA in E. coli in
order to examine its function. The cDNA was expressed as a
fusion protein with 22 N-terminal residues of β-galactosidase
and two threonine residues followed by the SBE I-D2
cDNA sequence either in or out of frame. Although an
expected product of about 75 kDa in size was produced from
only the in-frame fusion, we could not detect any enzyme
activity from crude extracts of E. coli protein.
Furthermore, the in-frame construct could not complement an
E. coli strain with a defined deletion in glycogen
branching, although other putative branching enzyme cDNAs
have been shown to be functional by this assay (data not
shown). It is therefore unclear whether the wSBE I-D2 gene
in λE7 codes for an active enzyme in vivo.

Example 5 Gene Structure in E7

i. Sequence of wSBE I-D2

We sequenced 9.2 kb of DNA that contained
wSBE I-D2. This corresponds to fragments 7.31, 7.8 and
7.18. Fragment 7.31 was sequenced in its entirety (4.1 kb),
but the sequence of about 30 bases about 2 kb upstream of
the start of the gene could not be obtained because it was
composed entirely of Gs. Elevation of the temperature of
sequencing did not overcome this problem. Fragments 7.8
(1 kb) and 7.18 (4 kb) were completely sequenced, and
corresponded to 2 kb downstream of the last exon detected
for this gene. It was clear that we had isolated a gene
which was closely related (approximately 95% sequence
identity) to the SBE I-D2 type cDNA referred to above,
except that the last 200 bp at the 3' end of the cDNA are
not present. The wSBE I-D2 gene includes sequences
corresponding to rice exon 11 which are not in the cDNA
clone. In addition it does not have exons 9, 12, 13 or 14;
these are also absent from the SBE I-D2 type cDNA. The
first two exons show lower identity to the corresponding
exons from rice (approximately 60%) (Kawasaki et al, 1993)
than do the other exons (about 80%). A diagrammatic
exon-intron structure of the wSBE I-D2 gene is indicated in
Figure 5. The restriction map was confirmed by sequencing
the PCR products that spanned fragments 7.18 and 7.8, and 7.8
and E7.31 (see Figure 3) respectively.

ii. Sequence of wSBE I-D3

This gene was not sequenced in detail, as the
genomic clone did not extend far enough to include the 5'
end of the sequence. The sequence is of a SBE I type. The
orientation of the gene is evident from sequencing of the
relevant BamHI fragments, and was confirmed by sequence
analysis of a PCR product generated using primers from the
right arm of lambda and a primer from the middle of the
gene. The sequence homology with wSBE I-D2 is about 80% over
the regions examined. The 2 kb sequenced corresponded to
exons 5 and 6 of the rice gene; these sequences were
obtained by sequencing the ends of fragments 7.5, 7.4 and
7.14 respectively, although the sequences from the left end
of fragment 7.14 did not show any homology to the rice
sequences. The gene does not appear to share the 3' end of
SBE I-D2 type cDNA, as a probe from 500 bp at the 3' end of
the cDNA (including sequences corresponding to exons 8 and
10 of rice) did not hybridise to fragment 7.14, although
it hybridised to fragment 7.18.

iii. Sequence of wSBE I-D1

This gene was also not sequenced in detail, as it
was clear that the genomic clone did not extend far enough
to include the 5' sequences. Limited sequencing suggests
that it is also of the SBE I type. The orientation relative
to the left arm of lambda was confirmed by sequencing a PCR
product that used a primer from the left arm of lambda and
one from the middle of the gene (as above). Its sequence
homology with wSBE I-D2, D3 and D4 (see below) is about 75%
in the region sequenced, corresponding to a part of exon 4 of
the rice gene.

Starch branching enzymes are members of the α-amylase
protein family, and in a recent survey Svensson
(1994) identified eight residues in this family that are
invariant, seven in the catalytic site and a glycine in a
short turn. Of the seven catalytic residues, four are
changed in SBE I-D2 type. However, additional variation in
the 'conserved' residues may come to light when more plant
cDNA sequences for branching enzyme I are available for
analysis. In addition, although exons 9, 11, 12, 13 and 14
from rice are not present in the SBE I-D2 type cDNA,
comparison of the maize and rice SBE I sequences indicates
that the 3' region (from amino acid residue 730 of maize) is
much more variable than the 5' and central regions. The
active sites of rice and maize SBE I sequences, as indicated
by Svensson (1994), are encoded by sequences that are in the
central portion of the gene. When SBE II sequences from
Arabidopsis were compared by Fisher et al (1996) they also
found variation at the 3' and 5' ends. SBE I-D2 type cDNA
may encode a novel type of branching enzyme whose activity
is not adequately detected in the current assays for
detecting branching enzyme activity; alternatively the cDNA
may correspond to an endosperm mRNA that does not produce a
functional protein.

Example 6 Cloning of the cDNA corresponding to the
wSBE I-D4 gene

The first strand cDNAs were synthesized from 1 µg
of total RNA, derived from endosperm 12 days after
pollination, as described by Sambrook et al (1989), and then
used as templates to amplify two specific cDNA regions of
wheat SBE I by PCR.

Two pairs of primers were used to obtain the cDNA
clones BED1 and BED3 (Table 1). Primers used for cloning of
BED3 were the degenerate primer NTS5'

5' GGC NAC NGC NGA G/AGA C/TGG 3' (SEQ ID NO. 1),

based on the N-terminal sequence of the purified
wheat endosperm SBE I protein, in which the 5' end of the
primer is at position 168 of wSBE I-D4 cDNA, as shown in
Table 1, and the primer NTS3'

5' TAC ATT TCC TTG TCC ATCA 3' (SEQ ID NO. 2),

in which the 5' end is at position 1590 of
wSBE I-D4 cDNA (see Table 1), designed to anneal to the
conserved regions of the nucleotide sequences of BED5 and
the maize and rice SBE I cDNAs. For clone BED1, the
primers used were BEC5'

5' ATC ACG AGA GCT TGC TCA (SEQ ID NO. 3),

in which the 5' end is at position 1 of wSBE I-D4
cDNA (see Table 1); the sequence was based on the wSBE I-D4
gene, and BEC3'

CGG TAG ACA GTT GCG TGA TTT TC 3' (SEQ ID NO. 4),

in which the 5' end is at position 334 of
wSBE I-D4 cDNA (see Table 1), and the sequence was based on
BED3.
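
The degenerate primer NTS5' uses N for any base and slash notation (G/A, C/T) for two-base alternatives. As an illustration only, the sketch below rewrites the primer in IUPAC codes (R for G/A, Y for C/T; this conversion is an assumption about notation, not something stated in the text) and counts how many distinct oligonucleotides such a pool would contain. The function names are hypothetical.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes needed for this primer.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine, written G/A in the text
    "Y": "CT",    # pyrimidine, written C/T in the text
    "N": "ACGT",  # any base
}

# NTS5' (SEQ ID NO. 1) with the slash notation converted to IUPAC codes.
NTS5 = "GGCNACNGCNGARGAYGG"


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def expand(primer: str):
    """Yield every concrete sequence in the degenerate pool."""
    for combo in product(*(IUPAC[base] for base in primer)):
        yield "".join(combo)


print(len(NTS5), degeneracy(NTS5))
```

Three N positions and two two-fold positions give 4 x 4 x 4 x 2 x 2 combinations, which is why degenerate primers derived from protein sequence are usually kept short.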



Example 7 Identification of the gene from the Triticum
tauschii SBE I family which is expressed in
the endosperm

We have isolated two classes of SBE I genomic
clones from T. tauschii. One class contained two genomic
clone isolates, and this class has been characterised in
some detail (Rahman et al, 1997). The complete gene
contained within this class of clones was termed wSBE I-D2;
there were additional genes at either end of the clone, and
these were designated wSBE I-D1 and wSBE I-D3. The other
class contained nine genomic clone isolates. Of these, λE1
was arbitrarily taken as a representative clone, and its
restriction map is shown in Figure 3; the SBE I gene
contained in this clone was called wSBE I-D4.

Fragments E1.1 (0.8 kb) and E1.2 (2.1 kb) and
fragments E1.7 (4.8 kb) and E1.5 (3 kb) respectively were
completely sequenced. Fragment E1.7 was found to encode the
N-terminal of the SBE I which is found in the endosperm, as
described in Morell et al (1997). This is shown in
Figure 6. Using antibodies raised against the N-terminal
sequence, Morell et al (1997) found that the D genome
isoform was the most highly expressed in the cultivars
Rosella and Chinese Spring. We have thus isolated from
T. tauschii a gene, wSBE I-D4, whose homologue in the
hexaploid wheat genome encodes the major isoform for SBE I
that is found in the wheat endosperm.

Table 1

Location of structural features and probes within wSBE I-D4
sequence.

A. Location of exons by comparison with the cDNA sequence of
Repellin et al (1997). Accession number Y12320.

Exon number    Start posn    End posn
1              4890          4987
2              5082          5149
3              5524          5731
4              5819          5888
5              6149          6318
6              6519          7424
7              7744          7860
8              8015          8077
9              8562          8670
10             9137          9237
11             9421          9488
12             9580          9661
13             9781          9897
14             9990          10480

B. Other features

Name of feature                        wSBE I-D4       D4 cDNA
                                       sequence        sequence
Putative initiation of translation     4900            11
Mature N-terminal sequence of SBE I    5550            124
End of translated SBE I sequence       10225           2431
End of D4 cDNA sequence                10461           2687
wSBE I-D45                             4870, 5860      1, 354
wSBE I-D43                             10116, 10435    2338, 2657
E1.1                                   5680, 6400      380, 630
BED1                                   -               1, 354
BED2                                   -               169, 418
BED3                                   -               151, 1601
BED4                                   -               867, 2372
BED5                                   -               867, 2687
Endosperm box like motif TGAAAAGT      4480, 590       -
CAAAT motif                            4863            -
TATAAA motif                           4833            -

All nine genomic clones of the λE1 type isolated
from T. tauschii appear to contain the wSBE I-D4 gene, or
very similar genes, on the basis of PCR amplification and
hybridisation experiments. However, the restriction
patterns obtained for the clones differ with BamHI and
EcoRI, among other enzymes, indicating that either the
clones represent near-identical but distinct genes or they
represent the same gene isolated in distinct products of the
Sau3A digest used to generate the library.

Example 8 Investigation of other SBE I genomic clones
isolated

All ten members of the λE1-like class of SBE I
genomic clones were investigated by hybridisation with
probes derived from fragment E1.7 (sequence wSBE I-D45,
encoding the translation start signal and the first
100 amino acids from the N-terminal end and intron
sequences; see Table 1) and from fragment E1.5 (sequence
wSBE I-D43, corresponding largely to the 3' untranslated
sequence and containing intron sequences; see Table 1). The
results obtained were consistent with one type of gene being
isolated in different fragments in the different clones, as
shown in Figure 7. PCR products were obtained from the
clones λE1, 2, 9, 14, 27, 31 and 52 using primers that
amplify near the 5' end of the gene (positions 5590-6162 of
wSBE I-D4); these products hybridised to wSBE I-D45.
Sequencing showed no differences in sequence of a 200 bp
product.

Analysis of the promoter for wSBE I-D4 allows us
to investigate the presence of motifs previously described
for promoters that regulate gene expression in the
endosperm. Forde et al (1985) compared prolamin promoters,
and suggested that the presence of a motif approximately
300 bp upstream of the transcription start point, called
the endosperm box, was responsible for endosperm-specific
expression. The endosperm box was subsequently considered
to consist of two different motifs: the endosperm motif (EM)
(canonical sequence TGTAAAG) and the GCN4 motif (canonical
sequence G/ATGAG/CTCAT). The GCN4 box is considered to
regulate expression according to nitrogen availability
(Muller and Knudsen, 1993). The wSBE I-D4 promoter contains
a number of imperfect EM-like motifs at approximately -100,
-300 and -400 as well as further upstream. However, no GCN4
motifs could be found, which lends support to the idea that
this motif regulates response to nitrogen, as starch
biosynthesis is not as directly dependent on the nitrogen
status of the plant as storage protein synthesis. Comparison
of the promoters for wSBE I-D4 and D2 (Rahman et al, 1997)
indicates that although there are no extensive sequence
homologies there is a region of about 100 bp immediately
before the first encoded methionine where the homology is
61% between the two promoters. In particular there is an
almost perfect match in the sequence over twenty base pairs,
CTCGTTGCTTC(C/T)ACTCCACT (positions 4723-4742 of the
wSBE I-D4 sequence), but the significance of this is hard to
gauge, as it does not occur in the rice promoter for SBE I.
The availability of more promoters for starch biosynthetic
enzymes may allow firmer conclusions to be drawn. There are
putative CAAT and TATA motifs at positions 4870 and 4830
respectively of wSBE I-D4 sequence. The putative start of
translation of the mRNA is at position 4900 of wSBE I-D4.
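
The kind of motif scan described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the specification: imperfect EM-like sites are taken to be windows within one mismatch of the canonical TGTAAAG, the GCN4 motif G/ATGAG/CTCAT is written as a regular expression, and the test sequence is invented for the example rather than taken from wSBE I-D4. Only the forward strand is scanned.

```python
import re

EM_CANONICAL = "TGTAAAG"                       # endosperm motif (EM)
GCN4_PATTERN = re.compile(r"[GA]TGA[GC]TCAT")  # G/ATGAG/CTCAT


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def em_like_sites(seq: str, max_mismatches: int = 1):
    """Return (0-based position, site, mismatches) for EM-like windows."""
    k = len(EM_CANONICAL)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        d = hamming(window, EM_CANONICAL)
        if d <= max_mismatches:
            hits.append((i, window, d))
    return hits


def gcn4_sites(seq: str):
    """Return 0-based start positions of exact GCN4 motif matches."""
    return [m.start() for m in GCN4_PATTERN.finditer(seq)]


# Hypothetical demonstration sequence, not taken from wSBE I-D4.
demo = "CCATGTAAAGTTGCTGAAAAGCCGATGACTCATGG"
print(em_like_sites(demo))
print(gcn4_sites(demo))
```

Allowing a single mismatch is an arbitrary threshold chosen for the sketch; a looser threshold would report more candidate sites at the cost of more chance matches.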

Figure 5 shows the structure of the wSBE I-D4
gene, compared with the genes from rice and wheat (Kawasaki
et al, 1993; Rahman et al, 1997). The rice SBE I has 14
exons compared with 13 for wSBE I-D4 and 10 for wSBE I-D2.
There is good conservation of exon-intron structure between
the three genes, except at the extreme 5' end. In particular
the sizes of intron 1 and intron 2 are very different
between rice SBE I and wSBE I-D4.

Example 9 Isolation of cDNA for SBE I

Using the maize starch branching enzyme I cDNA as
a probe (Baba et al, 1991), 10 positive plaques were
recovered by screening approximately 10^5 plaques from a
wheat endosperm cDNA library prepared from the cultivar
Rosella, as described in Example 4. On purifying and
sequencing these plaques it was clear that even the longest
clone (BED5, 1822 bp) did not encode the N-terminal sequence
obtained from protein analysis. Degenerate primers based on
the wheat endosperm SBE I protein N-terminal sequence
(Morell et al, 1997) and the sequence from BED5 were then
used to amplify the 5' region; this produced a cDNA clone
termed BED3 (Table 1 and Figure 8). This cDNA clone
overlapped extensively and had 100% sequence identity with
BED5 and BED4 (Figure 8). As almost the entire protein
N-terminal sequence had been included in the primer sequence
design, this did not provide independent evidence of the
selection of a cDNA sequence in the endosperm that encoded
the protein sequence of the main form of SBE I. Using
BED3 to screen a second cDNA library produced BED2, which is
shorter than BED3 but confirmed the BED3 sequence at 100%
identity between positions 169 and 418 (Figure 8 and
Table 1). In addition the entire cDNA sequence for BED3
could be detected at a 100% match in the genomic clone λE1.
Primers based on the putative transcription start point
combined with a primer based on the incomplete cDNAs
recovered were then used to obtain a PCR product from total
endosperm RNA by reverse transcription. This led to the
isolation of the cDNA clone BED1, of 300 bp, whose location
is shown in Figure 8. By analysing this product, a sequence
was again obtained that could be found exactly in the
genomic clone λE1, and which overlapped precisely with BED3.

The N-terminal of the protein matches that of
SBE I isolated from wheat endosperm by Morell et al (1997),
and thus the wSBE I-D4 cDNA represents the gene for the
predominant SBE I isoform expressed in the endosperm. The
encoded protein is 87 kDa; this is similar to proteins
encoded by maize (Baba et al, 1991) and rice (Nakamura et
al, 1992) cDNAs for SBE I and is distinct from the wSBE I-D2
cDNA described previously, in which the encoded protein was
74 kDa (Rahman et al, 1997).

Five cDNA clones were sequenced and their
sequences were assembled into one contiguous sequence using
a GCG program (Devereaux et al, 1984). The arrangement of
these sequences is illustrated in Figure 8, the nucleotide
sequence is shown in SEQ ID No: 5, and the deduced amino acid
sequence is shown in SEQ ID No: 6. The intact cDNA sequence,
wSBE I-D4 cDNA, is 2687 bp and contains one large open
reading frame (ORF), which starts at nucleotides 11 to 13
and ends at nucleotides 2432 to 2434. It encodes a
polypeptide of 807 amino acids with a molecular weight of
87 kDa. Comparison of the amino acid sequence encoded by
wSBE I-D4 cDNA with that encoded by maize and rice SBE I
cDNAs showed that there is 75-80% identity between any
two of these sequences at the nucleotide level and almost 90%
at the amino acid level. Alignment of these three
polypeptide sequences, as shown in Figure 4, along with the
deduced sequences for pea, potato and wSBE I-D2 type cDNA,
indicated that the sequences in the central region are
highly conserved, and sequences at the 5' end (about
80 amino acids) and the 3' end (about 60 amino acids) are
variable.
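
The ORF coordinates quoted above can be checked against the stated polypeptide length with a few lines of arithmetic. The sketch below assumes that the codon at nucleotides 2432 to 2434 is the stop codon, and uses a rough average residue mass of 110 Da; both are assumptions made for illustration, and the mass estimate is only a rule of thumb, so it is expected to differ slightly from the 87 kDa given in the text.

```python
ORF_START = 11           # first base of the start codon (nucleotides 11 to 13)
ORF_END = 2434           # last base of the final codon (nucleotides 2432 to 2434)
AVG_RESIDUE_DA = 110.0   # rough average residue mass; an assumption


def orf_summary(start: int, end: int, stop_included: bool = True):
    """Return (length in nt, residues encoded, approximate mass in kDa)."""
    length_nt = end - start + 1
    if length_nt % 3:
        raise ValueError("ORF length is not a multiple of three")
    codons = length_nt // 3
    # If the final codon is a stop codon it does not encode a residue.
    residues = codons - 1 if stop_included else codons
    approx_kda = residues * AVG_RESIDUE_DA / 1000.0
    return length_nt, residues, approx_kda


length_nt, residues, approx_kda = orf_summary(ORF_START, ORF_END)
print(length_nt, residues, round(approx_kda, 1))
```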

Svensson et al (1994) indicated that there were
several invariant residues in sequences of the α-amylase
super-family of proteins to which SBE I belongs. In the
sequence of maize SBE I these are in motifs commencing at
amino acid residue positions 341, 415, 472 and 537
respectively; these are also encoded in the wSBE I-D4
sequence (SEQ ID No: 9), further supporting the view that
this gene encodes a functional enzyme. This is in contrast
to the results with the wSBE I-D2 gene, where three of the
conserved motifs appear not to be encoded (Rahman et al,
1997).

There is about 90% sequence identity in the
deduced amino acid sequence between wSBE I-D4 cDNA and rice
SBE I cDNA in the central portion of the molecule (between
residues 160 and 740 for the deduced amino acid product from
wSBE I-D4 cDNA). The sequence identity of the deduced amino
acid sequence of the wSBE I-D4 cDNA to the deduced amino
acid sequence of wSBE I-D2 is somewhat lower (85% for the
most conserved region, between residues 285 and 390 for the
deduced product of wSBE I-D4 cDNA). Surprisingly, however,
wSBE I-D4 cDNA is missing the sequence that encodes amino
acids at positions 30 to 58 in rice SBE I (see Figure 4).
This corresponds to residues within the transit peptide of
rice SBE I. A corresponding sequence also occurs in the
deduced amino acid sequence from maize SBE I (Baba et al,
1991) and wSBE I-D2 type cDNA (Rahman et al, 1997).
Consequently the transit sequence encoded by wSBE I-D4 cDNA
is unusually short, containing only 38 amino acids, compared
with 55-60 amino acids deduced for most starch biosynthetic
enzymes in cereals (see for example Ainsworth, 1993; Nair et
al, 1997). The wSBE I-D4 gene does contain this sequence,
but this does not appear to be transcribed into the major
species of RNA from this gene, although it can be detected
at low relative abundance. This raises the possibility of
alternative splicing of the wSBE I-D4 transcript, and also
the question of the relative efficiency of
translation/transport of the two isoforms. The possibility
of alternative splicing in both rice and wheat has been
considered for soluble starch synthase (Baba et al, 1993;
Rahman et al, 1995). Alternative splicing of soluble starch
synthase would give a transit sequence of 40 amino acids,
which is the same length proposed for the product of
wSBE I-D4 cDNA.

We have previously used probes based on exons 4, 5
and 6 (E7.8 and E1.1, see Rahman et al, 1997) of wSBE I-D2 to
probe wheat and T. tauschii genomic DNA cleaved with PvuII
and BamHI respectively. This region is highly conserved
within rice SBE I, wSBE I-D2 and wSBE I-D4 and produced ten
bands with wheat DNA and five with T. tauschii DNA. Neither
PvuII nor BamHI cleaved within the probe sequences,
suggesting that each band represented a single type of SBE I
gene. We have described four SBE I genes from T. tauschii:
wSBE I-D1, wSBE I-D2, wSBE I-D3 and wSBE I-D4 (Rahman et al,
1997 and this specification), and so we may have accounted
for most of the genes in T. tauschii and, by extension, the
genes from the D genome of wheat. In wheat, at least two
hybridising bands could be assigned to each of
chromosomes 7A, 7B and 7D.

Example 10 Tissue specificity and expression during
endosperm development

The 300 bp of 3' untranslated sequence of
wSBE I-D4 cDNA does not show any homology with either the
wSBE I-D2 type cDNA that we have described earlier (Rahman
et al, 1997) or with BE-I from rice, as shown in Figure 5.
We have called this sequence wSBE I-D43C (see SEQ ID No: 9).
It seemed likely that wSBE I-D43C would be a specific probe
for this class of SBE I, and thus it was used to investigate
the tissue specificity. Hybridization of RNA from endosperm
of hexaploid T. tauschii cultures with SBE I, SBE II, SSS I,
DBE I, wheat actin, and wheat ribosomal RNA was examined.
RNA was purified at various numbers of days after anthesis
from plants grown with a 16 h photoperiod at 13°C (night)
and 18°C (day). The age of the endosperms from which RNA
was extracted, in days after anthesis, is given above the
lanes in the blot. Equivalent amounts of RNA were
electrophoresed in each lane. The probes used are identified
in Tables 1 and 2.

The results are shown in Figures 9a to 9g. An RNA
species of about 2700 bases in size was found to hybridise.
This is very close to the size of the wSBE I-D4 cDNA
sequence. RNA hybridising to wSBE I-D43C is most abundant
at the mid-stage of endosperm development, as shown in
Figure 9a, and in field grown material is relatively
constant during the period 12-18 days, the time at which
there is rapid starch and storage protein accumulation
(Morell et al, 1995).

The sequence contained within the wSBE I-D4 gene
appears to be expressed only in the endosperm (Figure 9a,
Figure 9b). We could not detect any expression in the leaf.
This could be because another isoform is expressed in the
leaf, and/or because the amount of SBE I present in the leaf
is much less than what is required in the endosperm.
Isolation of SBE I clones from a leaf cDNA library would
enable this question to be resolved.

Example 11 Intron-Exon Structure of SBE I

By comparison of the cDNA sequence of SBE I
(Repellin et al, 1997) with that of wSBE I-D4 we can deduce
the intron-exon structure of the gene for the major isoform
of SBE I that is found in the endosperm. The structure
contains 14 exons compared to 14 for rice (Kawasaki et al,
1993). These 14 exons are spread over 6 kb of sequence, a
distance similar to that found in both rice SBE I and
wSBE I-D2. A dotplot comparison of wSBE I-D4 sequence and
that of rice SBE I sequence, depicted in Figure 10, shows
good sequence identity over almost the entire gene starting
from about position 5100 of wSBE I-D4; the identity is poor
over the first 5 kb of sequence corresponding largely to the
promoter sequences. The sequence identity over introns
(about 60%) is lower than over exons (about 85%).
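
The exon coordinates listed in part A of Table 1 can be used to check the stated extent of the gene. The sketch below copies those start and end positions and derives exon lengths, intron lengths and the overall span from the first base of exon 1 to the last base of exon 14; the function names are illustrative only. The span it reports is a little under 6 kb, in line with the figure quoted above.

```python
# (start, end) genomic coordinates of exons 1-14, copied from Table 1, part A.
EXONS = [
    (4890, 4987), (5082, 5149), (5524, 5731), (5819, 5888),
    (6149, 6318), (6519, 7424), (7744, 7860), (8015, 8077),
    (8562, 8670), (9137, 9237), (9421, 9488), (9580, 9661),
    (9781, 9897), (9990, 10480),
]


def exon_lengths(exons):
    """Length of each exon, counting both end positions."""
    return [end - start + 1 for start, end in exons]


def intron_lengths(exons):
    """Length of each gap between consecutive exons."""
    return [next_start - prev_end - 1
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]


def gene_span(exons):
    """Distance from the first base of exon 1 to the last base of the last exon."""
    return exons[-1][1] - exons[0][0] + 1


print(len(EXONS), gene_span(EXONS))
print(exon_lengths(EXONS)[:3], intron_lengths(EXONS)[:3])
```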

Example 12 Repeated Sequences in SBE I

Sequencing of wSBE I-D4 revealed there was a
repeated sequence of at least 300 bp contained in a 2 kb
fragment about 600 bp after the 3' end of the gene. We have
called this sequence wSBE I-D4R (SEQ ID No: 9). This
repeated sequence is within fragment E1.5 (Figure 3 and
Table 1) and is flanked by non-repetitive sequences from the
genomic clone. We have previously shown that the
restriction pattern obtained by digesting λE1 with the
restriction enzyme BamHI is also obtained when T. tauschii
DNA is digested. Thus wSBE I-D4R is unlikely to be a
cloning artefact. A search of the GenBank Database revealed
that wSBE I-D4R shared no significant homology with any
sequence in the database. Hybridisation experiments with
wSBE I-D4R showed that all of the other SBE I-D4 type
genomic clones (except number 29) contained this repeated
sequence (data not shown). The wSBE I-D4R sequence was not
highly repeated and occurred in the wheat genome with a
similar frequency as the wSBE I-D4 sequence.
When SBE I-D4R was used as the probe on wheat DNA
from the nulli-tetra lines, four bands were obtained; two of
these bands could be assigned to chromosome 7A and the
others to chromosomes 7B and 7D (Figure 11). One of the two
BamHI fragments from wheat DNA which could be assigned to
chromosome 7A was distinct from the single band from
chromosome 7A detected using wSBE I-D43 as the probe; the
other three bands coincided in the autoradiograph with bands
obtained with wSBE I-D43, and are likely to represent the
same fragment. However, one of these fragments was distinct
from the BamHI fragment that hybridised to the wSBE I-D43
sequence. In wSBE I-D4 (see SEQ ID No: 9), the wSBE I-D43
sequence is only 300 bp upstream of wSBE I-D4R, and occurs
in the same BamHI fragment. These results suggest that the
wSBE I-D4R sequence can occur independently of wSBE I-D4 in
the wheat genome.

Example 13 Isolation of Genomic Clones Encoding SBE II

Screening of a cDNA library, prepared from the
wheat endosperm as described in Example 4, with the maize
BE I clone (Baba et al, 1991) at low stringency led to the
isolation of two classes of positive plaques. One class was
strongly hybridising, and led to the isolation of wheat
SBE I-D2 type and SBE I-D4 type cDNA clones, as described in
Example 5 and in Rahman et al (1997). The second class was
weakly hybridising, and one member of this class was
purified. This weakly hybridising clone was termed SBE-9,
and on sequencing was found to contain a sequence that was
distinct from that for SBE I. This sequence showed greatest
homology to maize BE II sequences, and was considered to
encode part of the wheat SBE II sequence.

The screening of approximately 5 x 10^5 plaques
from a genomic library constructed from T. tauschii (see
Example 1) with the SBE-9 sequence led to the isolation of
four plaques that were positive. These were designated
wSBE II-D1 to wSBE II-D4 respectively, and were purified and
analysed by restriction mapping. Although they all had
different hybridization patterns with SBE-9, as shown in
Figure 12, the results were consistent with the isolation of
the same gene in different-sized fragments.

Example 14 Identification of the N-terminal sequence of
SBE II

Sequencing of the SBE II gene contained in
clone 2, termed SBE II-D1 (see SEQ ID No: 10), showed that it
coded for the N-terminal sequence of the major isoform of
SBE II expressed in the wheat endosperm, as identified by
Morell et al (1997). This is shown in Figure 13.

Example 15 Intron-Exon Structure of the SBE II Gene

In addition to encoding the N-terminal sequence of
SBE II, as shown in Example 10, the cDNA sequence reported
by Nair et al (1997) was also found to have 100% sequence
identity with part of the sequence of wSBE II-D1. Thus the
intron-exon structure can be deduced, and this is shown in
Figure 14. The positions of exons and other major structural
features of the SBE II gene are summarized in Table 2.

25 

Example 16 Number of SBE II Genes in T. tauschii and
           Wheat

Hybridisation of the SBE II conserved region with
T. tauschii DNA revealed the presence of three gene classes.
However, in our screening we only recovered one class.
Hybridisation to wheat DNA indicated that the locus for
SBE II was on chromosome 2, with approximately 5 loci in
wheat; most of these appear to be on chromosome 2D, as shown
in Figure 15.
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Table 2 

Positions of structural features in wSBE II-D1. 

A. Positions of exons.

Exon number    Genomic start    Genomic finish
 1              1058             1336
 2              1664             1761
 3              2038             2279
 4              2681             2779
 5              2949             2997
 6              3145             3204
 7              3540             3620
 8              3704             3825
 9              4110             4188
10              4818             4939
11              5115             5234
12              6209             6338
13              6427             6549
14              6739             6867
15              7447             7550
16              8392             8536
17              9556             9703
18              9839             9943
19             10120            10193
20             10395            10550
21             10928            11002
22             11092            11475

B. Other structural features within the wSBE II-D1 DNA
sequence

Feature                                    Position
Putative initiation of translation         1214
Mature N-terminal sequence of SBE II       1681
wSBE II-D13                                11116 to 11448
Endosperm box like motif TGAAAAGT          521
Endosperm box like motif TGAAAGT           565
Endosperm box like motif CGAAAAT           669
Endosperm box like motif TAAATGT           768
CAAAAT motif                               784
TCAATT motif                               1108
TATAAA motif                               799
AATTAA motif                               1110
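
The exon coordinates in Table 2 lend themselves to a simple arithmetic cross-check. The following is a minimal sketch only: it assumes the positions are 1-based and inclusive (the table does not say so), and the function names are illustrative rather than part of the described method. It derives exon lengths and the implied intron intervals, and reports which exon contains a given position, such as the putative translation start (1214) or the mature N-terminus (1681).

```python
# Exon (start, finish) pairs for wSBE II-D1, copied from Table 2, part A.
# Assumption: coordinates are 1-based and inclusive.
EXONS = [
    (1058, 1336), (1664, 1761), (2038, 2279), (2681, 2779),
    (2949, 2997), (3145, 3204), (3540, 3620), (3704, 3825),
    (4110, 4188), (4818, 4939), (5115, 5234), (6209, 6338),
    (6427, 6549), (6739, 6867), (7447, 7550), (8392, 8536),
    (9556, 9703), (9839, 9943), (10120, 10193), (10395, 10550),
    (10928, 11002), (11092, 11475),
]


def exon_lengths(exons):
    """Length of each exon in base pairs."""
    return [finish - start + 1 for start, finish in exons]


def intron_intervals(exons):
    """Intervals lying between consecutive exons."""
    return [
        (exons[i][1] + 1, exons[i + 1][0] - 1)
        for i in range(len(exons) - 1)
    ]


def exon_containing(position, exons):
    """Return the 1-based exon number containing position, or None."""
    for number, (start, finish) in enumerate(exons, start=1):
        if start <= position <= finish:
            return number
    return None


if __name__ == "__main__":
    for number, length in enumerate(exon_lengths(EXONS), start=1):
        print(f"exon {number}: {length} bp")
    print("translation start in exon", exon_containing(1214, EXONS))
    print("mature N-terminus in exon", exon_containing(1681, EXONS))
```

Under these assumptions the putative translation start falls in exon 1 and the mature N-terminus in exon 2, while the upstream motifs listed in part B (for example TATAAA at 799) fall outside every exon.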
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Example 17 Expression of SBE II 

Investigation of the pattern of expression of
SBE II revealed that the gene was only expressed in the
endosperm. However, the timing of expression was quite
distinct from that of SBE I, as illustrated in
Figures 9a, 9b and 9c.

SBE I gene expression is only clearly detectable
from the mid-stage of endosperm development (10 days after
anthesis in Figure 9b), whereas SBE II gene expression is
clearly seen much earlier, in endosperm tissue at 5-8 days
after anthesis (Figures 9a and 9c), corresponding to an
early stage of endosperm development. The hybridisation of
wheat endosperm mRNA with the actin and ribosomal RNA genes
is shown as controls (Figures 9f and 9g, respectively).

Example 18 Cloning of Wheat Soluble Starch Synthase
           cDNA

Primers for amplification of SSS I were synthesised from
a conserved sequence region, identified by
comparison of the nucleotide sequences encoding soluble
starch synthases of rice and pea. A 300 bp RT-PCR product
was obtained by amplification of cDNA from wheat endosperm
at 12 days post anthesis. The 300 bp RT-PCR product was
then cloned, and its sequence analysed. Comparison of
its sequence with rice SSS cDNA showed about 80% sequence
homology. The 300 bp RT-PCR product was 100% homologous to
the partial sequence of a wheat SSS I in the database
produced by Block et al (1997).

The 300 bp cDNA fragment of wheat soluble starch
synthase thus isolated was used as a probe for the screening
of a wheat endosperm cDNA library (Rahman et al, 1997).
Eight cDNA clones were selected. One of the largest cDNA
clones (sm2) was used for DNA sequencing analysis, and gave
a 2662 bp nucleotide sequence, which is shown in SEQ ID
NO: 14. A large open reading frame of this cDNA encoded a
647 amino acid polypeptide, starting at nucleotides 247 to
250 and terminating at nucleotides 2198 to 2200. The
deduced polypeptide was shown by protein sequence analysis
to contain the N-terminal sequence of a 75 kDa granule-bound
protein (Rahman et al, 1995). This is illustrated in
Figure 16. The location of the 75 kDa protein was
determined for both the soluble fraction and starch
granule-bound fraction by the method of Denyer et al (1995). Thus
this cDNA clone encoded a polypeptide comprising a 41 amino
acid transit peptide and a 606 amino acid mature peptide
(SEQ ID NO: 12). The cleavage site LRRL was located at amino
acids 36 to 39 of the transit peptide of this deduced
polypeptide.

Comparison of wheat SSS I with rice SSS and potato
SSS showed 87.4% and 75.9% homology, respectively, at the
amino acid level, and 74.7% and 58.1% homology at the
nucleotide level. Some amino acids in the N-terminal
sequences of the SSS I of wheat and rice were conserved.
Major features of the SSS I gene are summarized in Table 3.



Example 19 Isolation of Genomic Clone of Wheat Soluble
           Starch Synthase

Seven genomic clones were obtained with a 300 bp
cDNA probe by screening approximately 5 x 10^5 plaques from a
genomic DNA library of Triticum tauschii, as described
above. DNA was purified from 5 of these clones and digested
with BamHI and SacI. Southern hybridization analysis using
the 300 bp cDNA as probe showed that these clones could be
classified into two classes, as shown in Figure 17. One
genomic clone, sg3, contained a long insert, and was
digested with BamHI or SacI and subcloned into pBluescript
KS+ vector.
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Table 3 

Comparison of exons and introns of soluble starch synthase
I genes of wheat and rice

(1) Identity of exons of soluble starch synthase I genes of
wheat and rice

Exon   wSSI-D1   rSSI    Identity   Start site    Stop site
       (bp)      (bp)    (%)        (wSSI-D1)     (wSSI-D1)
1a     255       113     57.52      [illegible]   [illegible]
1b     316       298     58.92      1             316
2      356       356     82.87      1473          1828
3      78        78      92.31      2746          2823
4      125       125     90.40      2906          3028
5      82        82      89.02      4113          4194
6      174       174     93.10      4286          4459
7      82        82      93.90      4562          4643
8      92        92      92.39      4743          4835
9      63        63      90.48      4959          5021
10     90        90      82.22      5103          5192
11     125       125     88.80      8594          8718
12     109       109     91.74      8807          8915
13     53        53      81.13      8992          9044
14     40        41      80.00      9160          9199
15a    159       113     79.65      9499          9657
15b    392       539     46.46      9658          10098

(2) Identity of introns of soluble starch synthase I genes
of wheat and rice

Intron  wSSI-D1   rSSI    Identity   Start site    Stop site
        (bp)      (bp)    (%)        (wSSI-D1)     (wSSI-D1)
1       1156      907     41.05      317           1472
2       917       851     41.65      1829          2745
3       82        87      45.12      2824          2905
4       1084      835     48.50      3029          4112
5       91        96      57.78      4195          4285
6       102       189     52.48      4460          4561
7       99        96      52.08      4644          4742
8       123       110     45.46      4836          4958
9       81        78      58.97      5022          5102
10      3401      663     37.56      5193          8593
11      88        124     56.82      8719          8806
12      76        81      48.68      8916          8991
13      115       135     45.22      9045          9159
14      299       830     45.80      9200          9498

Note: Exon 1a: non-coding region of exon 1. Exon 1b: coding
region of exon 1.
Exon 15a: coding region of exon 15. Exon 15b: non-coding
region of exon 15.
wSSI-D1: wheat soluble starch synthase I gene.
rSSI: rice soluble starch synthase I gene.
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These subclones were analysed by sequencing. The 
intron/exon structure of the sg3 rice gene is shown in
Figure 18. The SSS I gene from T. tauschii is shown in SEQ
ID No: 13, while the deduced amino acid sequence is shown in
SEQ ID NO: 14.
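
The intron coordinates listed in Table 3 can be checked against the exon coordinates, since each intron should begin one base after the preceding exon and end one base before the following exon. The sketch below is illustrative only: it assumes 1-based inclusive coordinates, uses the values as transcribed in Table 3, and the helper name `check_introns` is hypothetical.

```python
# Exon (start, stop) coordinates in wSSI-D1 for exons 2 to 15a, from Table 3 (1).
# Assumption: coordinates are 1-based and inclusive.
EXON_ORDER = ["2", "3", "4", "5", "6", "7", "8", "9",
              "10", "11", "12", "13", "14", "15a"]
EXONS = {
    "2": (1473, 1828), "3": (2746, 2823), "4": (2906, 3028),
    "5": (4113, 4194), "6": (4286, 4459), "7": (4562, 4643),
    "8": (4743, 4835), "9": (4959, 5021), "10": (5103, 5192),
    "11": (8594, 8718), "12": (8807, 8915), "13": (8992, 9044),
    "14": (9160, 9199), "15a": (9499, 9657),
}

# Intron number -> (listed length, start, stop) in wSSI-D1, from Table 3 (2).
INTRONS = {
    1: (1156, 317, 1472), 2: (917, 1829, 2745), 3: (82, 2824, 2905),
    4: (1084, 3029, 4112), 5: (91, 4195, 4285), 6: (102, 4460, 4561),
    7: (99, 4644, 4742), 8: (123, 4836, 4958), 9: (81, 5022, 5102),
    10: (3401, 5193, 8593), 11: (88, 8719, 8806), 12: (76, 8916, 8991),
    13: (115, 9045, 9159), 14: (299, 9200, 9498),
}


def check_introns():
    """Return a list of human-readable inconsistencies (empty if none)."""
    problems = []
    for number, (length, start, stop) in sorted(INTRONS.items()):
        if stop - start + 1 != length:
            problems.append(f"intron {number}: listed length {length}, "
                            f"coordinates give {stop - start + 1}")
        # Intron n is followed by exon n+1 (exon 15a after intron 14).
        next_exon = EXON_ORDER[number - 1]
        if EXONS[next_exon][0] != stop + 1:
            problems.append(f"intron {number} does not abut exon {next_exon}")
        # Intron n (n >= 2) is preceded by exon n.
        if number >= 2:
            prev_exon = EXON_ORDER[number - 2]
            if EXONS[prev_exon][1] != start - 1:
                problems.append(f"intron {number} does not abut exon {prev_exon}")
    return problems


if __name__ == "__main__":
    issues = check_introns()
    print("no inconsistencies found" if not issues else "\n".join(issues))
```

Exons 1a, 1b and 15b are left out of the check because their boundaries are either unreadable or not flanked by a listed intron.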

Example 20 Northern Hybridization Analysis of the
           Expression of Genes Encoding Soluble Starch
           Synthase

Total RNAs were purified from leaves, pre-anthesis
material, and various stages of developing endosperm at 5-8,
10-15 and 18-22 days post anthesis. Northern hybridization
analysis showed that mRNAs encoding wheat SSS I were
specifically expressed in developing endosperm.
Expression of these mRNAs in the leaves and pre-anthesis
materials could not be detected by northern hybridization
analysis under these experimental conditions. Wheat SSS I
mRNAs were expressed at high levels at an early stage of
endosperm development, 5-8 days post anthesis, and the
expression level in endosperm at 10-15 days post anthesis
was reduced.
These results are summarized in Figure 9a and Figure 9d.

Example 21 Genomic Localisation of Wheat Soluble Starch
           Synthase

DNA from chromosome engineered lines was digested
with the restriction enzyme BamHI and blotted onto supported
nitrocellulose membranes. A probe prepared from the 3' end
of the cDNA sequence, from positions 2345 to 2548, was used
to hybridise to this DNA. The presence of a specific band
was shown to be associated with the presence of
chromosome 7A (Figure 19). These data demonstrate location
of the SSS I gene on chromosome 7.

Example 22 Isolation of SSS I Promoter

We have isolated the promoter that drives this
pattern of expression for SSS I. The pattern of expression
for SSS I is very similar to that for SBE II: the SSS I gene
transcript is detectable from an early stage of endosperm
development until the endosperm matures. The sequence of
this promoter is given in SEQ ID No: 15.

Example 23 Isolation of the Gene Encoding Debranching
           Enzyme from Wheat

The sugary-1 mutation in maize results in mature
dried kernels that have a glassy and translucent appearance;
immature kernels accumulate sucrose and other simple
sugars, as well as the water-soluble polysaccharide
phytoglycogen (Black et al, 1966). Most data indicate that
in sugary-1 mutants the concentration of amylose is
increased relative to that of amylopectin. Analysis of a
particular sugary-1 mutation (su-1Ref) by James et al
(1995) led to the isolation of a cDNA that shared
significant sequence identity with bacterial enzymes that
hydrolyse the α-1,6-glucosyl linkages of starch, such as an
isoamylase from Pseudomonas (Amemura et al, 1988), i.e.
bacterial debranching enzymes.

We have now isolated a sequence amplified from
wheat endosperm cDNA using the polymerase chain reaction
(PCR). This sequence is highly homologous to the sequence
for the sugary gene isolated by James et al (1995). This
sequence has been used to isolate homologous cDNA sequences
from a wheat endosperm library and genomic sequences from
Triticum tauschii.

Comparison of the deduced amino acid sequences of
DBE from maize with spinach (Accession SOPULSPO, GenBank
database), Pseudomonas (Amemura et al, 1988) and rice
(Nakamura et al, 1997) enabled us to deduce sequences which
could be useful in wheat. When these sequences were used as
PCR amplification primers with wheat genomic DNA, a product
of 256 bp was produced. This was sequenced and was compared
to the sequence of maize sugary isolated by James et al
(1995). The results are shown in Figure 20a and Figure 20b.
This sequence has been termed wheat debranching enzyme
sequence I (WDBE-I).

Use of WDBE-I to investigate a cDNA library
constructed from wheat endosperm (Rahman et al, 1997)
enabled us to isolate two cDNA clones which hybridise
strongly to the WDBE-I probe. The nucleotide sequence of
the DNA insert in the longest of these clones is given in
SEQ ID No: 16.

Use of WDBE-I to investigate a genomic library
constructed from T. tauschii, as described above, has led to
the isolation of four genomic clones, designated I1, I2, I3
and I4, respectively, which hybridised strongly to the
WDBE-I sequence. These clones were shown to contain copies
of a single debranching enzyme gene. The sequence of one of
these clones, I2, is given in SEQ ID No: 17. The intron/exon
structure of the gene is shown in Figure 20c. Exons 1 to 4
were identified by comparison with the maize sugary-1 cDNA,
while Exons 5 to 18 were identified by comparison with the
cDNA sequence given in SEQ ID No:16. The major features of
the DBE I gene are summarized in Table 4.

Hybridization of WDBE-I to DNA from T. tauschii
indicates one hybridizing fragment (Figure 21a). The
chromosomal location of the gene was shown to be on
chromosome 7 through hybridisation to nullisomic/tetrasomic
lines of the hexaploid wheat cultivar Chinese Spring
(Figure 21b).

We have clearly isolated a sequence from the wheat
genome that has high identity to the debranching enzyme cDNA
of maize characterised by James et al (1997). The isolation
of homologous cDNA sequences and genomic sequences enables
further characterisation of the debranching enzyme cDNA and
promoter sequences from wheat and T. tauschii. These
sequences and the WDBE-I sequences shown herein are useful
in the manipulation of wheat starch structure through
genetic manipulation and in the screening for mutants at the
equivalent sugary locus in wheat.

Figure 9e shows that the DBE I gene is expressed
during endosperm development in wheat and that the timing of
expression is similar to the SBE II and SSS I genes. Figure 9h
shows that the full length mRNA for the gene (3.0 kb) is
found only in the wheat endosperm.

Example 24 Transient assays of Promoter-GFP Fusions

DNA constructs

DNA constructs for transient expression assays
were prepared by fusing sequences from the BEII and SSI
promoters to the gene encoding the Green Fluorescent
Protein. Green Fluorescent Protein (GFP) constructs
contained the GFP gene described by Sheen et al. (1995). The
nos 3' element (Bevan et al., 1983) was inserted 3' of the
GFP gene. The plasmid vector (pWGEM__NZfp) was constructed by
inserting the NotI to HindIII fragment from the following
sequence:

5' GCGGCCGCTC CCTGGCCGAC TTGGCCGAAG CTTGCATGCC TGCAGGTCGA
   CTCTAGAGGA TCCCCGGGTA CCGAGCTCGA ATTCATCGAT GATATCAGAT
   CCGGGCCCTC TAGATGCGGC CGCATGCATA AGCTT 3'

into the NotI and HindIII sites of pGem-13Zf(-) vector
(Promega). The sequences at the junction of the wSSSIpro1
and wSSSIpro2 and GFP were identical, and included the
junction sequence:

5' ....CGCGCGCCCA CACCCTGCAG GTCGACTCTA GAGGATCCAT GGTGAGCAAG 3'.

The sequence at the junction of wsbeIIpro1 and GFP was:

5' GCGACTGGCT GACTCAATCA CTACGCGGGG ATCCATGGTG AGCAAGGGCG 3'.

The sequence at the junction of wsbeIIpro2 and GFP was:

5' GGACTCCTCT CGCGCCGTCC TGAGCCGCGG ATCCATGGTG AGCAAGGGCG 3'.

The structures of the constructs are shown in Figures 22a to
22f.
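
The polylinker and junction sequences given for the DNA constructs can be inspected programmatically. The following is a minimal sketch, not the method used to build the constructs: it assumes the sequences are exactly as printed in Example 24, uses the standard recognition sequences for NotI (GCGGCCGC), HindIII (AAGCTT) and BamHI (GGATCC), and the function names are illustrative. It locates the restriction sites in the polylinker and translates the reading frame that follows the BamHI site at each promoter-GFP junction using the standard genetic code.

```python
from itertools import product

# Sequences as printed in Example 24, with spaces removed.
POLYLINKER = (
    "GCGGCCGCTCCCTGGCCGACTTGGCCGAAGCTTGCATGCCTGCAGGTCGA"
    "CTCTAGAGGATCCCCGGGTACCGAGCTCGAATTCATCGATGATATCAGAT"
    "CCGGGCCCTCTAGATGCGGCCGCATGCATAAGCTT"
)
JUNCTIONS = {
    "wSSSIpro1/wSSSIpro2": "CGCGCGCCCACACCCTGCAGGTCGACTCTAGAGGATCCATGGTGAGCAAG",
    "wsbeIIpro1": "GCGACTGGCTGACTCAATCACTACGCGGGGATCCATGGTGAGCAAGGGCG",
    "wsbeIIpro2": "GGACTCCTCTCGCGCCGTCCTGAGCCGCGGATCCATGGTGAGCAAGGGCG",
}

# Standard recognition sequences of the enzymes named in the text.
SITES = {"NotI": "GCGGCCGC", "HindIII": "AAGCTT", "BamHI": "GGATCC"}

# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def find_sites(seq, motif):
    """Return the 0-based start of every occurrence of motif in seq."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def translate(seq):
    """Translate complete codons of seq; a trailing partial codon is ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def reading_frame_after_bamhi(junction):
    """Translate the sequence immediately 3' of the first BamHI site."""
    site_start = junction.find(SITES["BamHI"])
    return translate(junction[site_start + len(SITES["BamHI"]):])


if __name__ == "__main__":
    for enzyme, motif in SITES.items():
        print(enzyme, find_sites(POLYLINKER, motif))
    for name, junction in JUNCTIONS.items():
        print(name, reading_frame_after_bamhi(junction))
```

In each junction the BamHI site is followed directly by ATG, so the translated fragment begins with methionine.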
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Table 4 

Structural features of wDBEI-D1

A. Positions of exons

Exon      Start       End         Comments
number    position    position
1         1890        2241        (deduced by comparison with maize)
2         2342        2524        (deduced by comparison with maize)
3         2615        2707        (deduced by comparison with maize)
4         3016        3168        (deduced by comparison with maize)
5         3360        3436
6         4313        4454
7         4526        4633
8         4734        4819
9         5058        5129
10        5202        5328
11        5558        5644
12        6575        6671
13        7507        7661
14        8450        8527
15        8739        8823
16        8902        8981
17        9114        9231
18        Still being sequenced

Note that following nucleotides 3330, 6330 and 8419 there
may be short regions of DNA not yet sequenced.

B.

CAAAAT motif                           1833
TCAAT motif                            1838
ATAAATAA motif                         1804
Endosperm box like motif TAAAACG       1463
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Preparation of target tissue 

All explants used for transient assay were from
the hexaploid wheat cultivar, Milliwang. Endosperm (10 - 12
days after anthesis), embryos (12 - 14 days after anthesis)
and leaves (the second leaf from the top of plants
containing 5 leaves) were used. Developing seed or leaves
were collected, surface sterilized with 1.25% w/v sodium
hypochlorite for 20 minutes and rinsed with sterile
distilled water 8 times. Endosperms or embryos were
carefully excised from seed in order to avoid contamination
with surrounding tissues. Leaves were cut into 0.5 cm x 1
cm pieces. All tissues were aseptically transferred onto
SD1SM medium, which is an MS based medium containing 1 mg/L
2,4-D, 150 mg/L L-asparagine, 0.5 mg/L thiamine, 10 g/L
sucrose, 36 g/L sorbitol and 36 g/L mannitol. Each agar
plate contained either 12 endosperms, 12 embryos or 2 leaf
segments.



Preparation of gold particles and bombardment 

Five µg of each plasmid was used for the
preparation of gold particles, as described by Witrzens et
al. (1998). Gold particle-DNA suspension in ethanol (10 µl)
was used for each bombardment using a Bio-Rad helium-driven
particle delivery system, PDS-1000.

GFP assay 

The expression of GFP was observed after 36 to 72
hours incubation using a fluorescence microscope. Two plates
were bombarded for each construct. The numbers of expressing
regions were recorded for each target tissue, and are
summarized in Table 5. The intensity of the expression of
GFP from each of the promoters was estimated by visual
comparison of the light intensity emitted, and is summarized
in Table 6.

The DNA construct containing GFP without a
promoter region (pZLGFPNot) gave no evidence of transient
expression in embryo (panel l) or leaf (panel r), and
extremely weak and sporadic expression in endosperm (panel
f); this construct gave only very weak expression in
endosperm with respect to the number (Table 5) and
intensity (Table 6) of transient expression regions. The
constructs pwsssIpro1gfpNOT (panels b, h and n),
psbeIIpro1gfpNOT (panels d, j and p), and psbeIIpro2gfpNOT
(panels e, k and q) yielded low numbers (Table 5) of
strongly (Table 6) expressing regions in leaves, and there
was a very uneven distribution of expressing regions between
target leaf pieces (Table 5). pwsssIpro2gfpNOT (panels c, i
and o) gave no evidence of transient expression in leaves
(Table 5). These results show that each of the promoter
constructs is able to drive the transient expression of GFP
in the grain tissues, endosperm and embryo. The ability of
the short SSI promoter (pwsssIpro1gfpNOT, containing 1042 bp
5' of the ATG translation start site) to drive expression in
leaves (panel n) contrasts with the inability of the long
SSI promoter (pwsssIpro2gfpNOT, containing a 3914 base pair
region 5' of the ATG translation start site; panel o) to do
so, suggesting that regions controlling tissue specificity
are located between -3914 and -1042 of the SSI promoter
region (SEQ ID No:15).

Table 6

Comparison of the Intensities of Transient Expression

Tissue      pact_js-   pwsssIpro1  pwsssIpro2  psbeIIpro1  psbeIIpro2  pZLGFPNot
            gfg_nos    gfpNOT      gfpNOT      gfpNOT      gfpNOT
Endosperm   10         4           2.5         3.5         1.5         0.5
Embryo      10         5.5         5.5         1.5         1           0
Leaf        10         20          0           10          10          0

All intensities are relative to pact_js-gfg_nos transient
expression in the target tissue.

Relative intensities were independently scored by three
researchers and averaged.

Example 25 Stable transformation of rice

Stable transformation of rice using Agrobacterium
was carried out essentially as described by Wang et al.
(1997). The plasmids containing the target DNA constructs
containing the promoter-reporter gene fusions are shown in
Figure 23. These plasmids were transformed into
Agrobacterium tumefaciens AGL1 by electroporation and
cultured on selection plates of LB media containing
rifampicin (50 mg/L) and spectinomycin (50 mg/L) for 2 to
3 days, and then gently suspended in 10 ml NB liquid medium
containing 100 µM acetosyringone and mixed well. Embryogenic
rice calli (2 to 3 months old) derived from mature seeds
were immersed in the A. tumefaciens AGL1
suspension. After 3-10 minutes the A. tumefaciens AGL1
suspension medium was removed, and the rice calli were
transferred to NB medium containing 100 µM acetosyringone
for 48 h. The co-cultivated calli were washed 7 times with
sterile Milli-Q H2O containing 150 mg/L timentin to remove
all Agrobacterium, plated on to NB medium containing 150
mg/L timentin and 30 mg/L hygromycin, and cultured for 3 to
4 weeks. Newly-formed buds on the surface of rice calli were
excised and plated onto NB Second Selection medium
containing 150 mg/L timentin and 50 mg/L hygromycin. After 4
weeks of proliferation, calli were plated onto NB
Pre-Regeneration medium containing 150 mg/L timentin and 50 mg/L
hygromycin, and cultured for 2 weeks. The calli were then
transferred on to NB-Regeneration medium containing 150 mg/L
timentin and 50 mg/L hygromycin for 3 to 4 weeks. Once
shooting occurs, shoots are transferred onto rooting medium
(1/2 MS) containing 50 mg/L hygromycin. Once adequate root
formation occurs, the seedlings are transferred to soil,
grown in a misting chamber for 1-2 weeks, and grown to
maturity in a containment glasshouse.

Example 26 Use of probes from SSS I, SBE I, SBE II and
           DBE sequences to identify null or altered
           alleles for use in breeding programmes

DNA primer sets were designed to enable
amplification of the first 9 introns of the SBE II gene
using PCR. The design of the primer sets is illustrated in
Figure 24. Primers were based on the wSBE II-D1 sequence
(deduced from Figure 13b and Nair et al, 1997; Accession No.
Y11282) and were designed such that intron sequences in the
wSBE II sequence were amplified by PCR. These primer sets
individually amplify the first 9 introns of SBE II. One
primer (sr913F) contained a fluorescent label at the 5' end.
Following amplification, the products were digested with the
restriction enzyme DdeI and analysed using an ABI 377 DNA
Sequencer with Genescan™ fragment analysis software. One
primer set, for intron 5, was found to amplify products from
each of chromosomes 2A, 2B and 2D of wheat. This is shown in
Figure 25, which illustrates results obtained with various
wheat lines, and demonstrates that products from each of the
wheat genomes from diverse wheats were amplified, and that
therefore lines lacking the wSBE II gene on a specific
chromosome could be readily identified. Lane (iii)
illustrates the identification of the absence of the A
genome wSBE II gene from the hexaploid wheat cultivar Chinese
Spring ditelosomic line 2AS.

Figure 26 compares results of amplification with
an Intron 10 primer set for various nullisomic/tetrasomic
lines of the hexaploid wheat Chinese Spring. Fluorescent
dUTP deoxynucleotides were included in the amplification
reaction. Following amplification, the products were
digested with the restriction enzyme DdeI and analysed using
an ABI 377 DNA Sequencer with Genescan™ fragment analysis
software. In lane (i), Chinese Spring ditelosomic line 2AS, a
300 base product is absent; in lane (ii), N2BT2A, a 204 base
product is absent; and in lane (iii), N2DT2B, a 191 base
product is absent. These results demonstrate that the
absence of specific wSBE II genes on each of the wheat
chromosomes can be detected by this assay. Lines lacking
wSBE II forms can be used as parental lines in breeding
programmes for generation of new lines in which expression
of SBE II is diminished or abolished, with consequent
increase in amylose content of the wheat grain. Thus a high
amylose wheat can be produced.
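
The logic of the Figure 26 assay, in which the absence of a product of a given size marks the absence of a particular wSBE II copy, can be sketched in a few lines. This is an illustrative sketch only: the size-to-chromosome mapping is inferred from the lanes described above (300 bases absent in ditelosomic 2AS, 204 in N2BT2A, 191 in N2DT2D), while the one-base sizing tolerance and the function name are assumptions, not part of the described method.

```python
# Product sizes (bases) for the Intron 10 primer set after DdeI digestion,
# assigned to chromosomes from the lanes described for Figure 26.
EXPECTED_PRODUCTS = {"2A": 300, "2B": 204, "2D": 191}


def missing_chromosomes(observed_sizes, tolerance=1):
    """Return chromosomes whose expected product is absent from a lane.

    observed_sizes: fragment sizes (bases) called for one lane.
    tolerance: allowed sizing error in bases (an assumed value).
    """
    missing = []
    for chromosome, size in EXPECTED_PRODUCTS.items():
        found = any(abs(obs - size) <= tolerance for obs in observed_sizes)
        if not found:
            missing.append(chromosome)
    return missing


if __name__ == "__main__":
    lanes = {
        "ditelosomic 2AS": [204, 191],
        "N2BT2A": [300, 191],
        "N2DT2B": [300, 204],
        "euploid (hypothetical lane)": [300, 204, 191],
    }
    for name, sizes in lanes.items():
        print(name, "->", missing_chromosomes(sizes) or "all three products present")
```

A line returning an empty list carries all three products; a line returning one or more chromosomes is a candidate parent lacking the corresponding wSBE II form.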

Table 7 shows examples of primer pairs for SBE I,
SSS I and DBE I which can identify genes from individual
wheat genomes and could therefore be used to identify lines
containing null or altered alleles. Such tests could be used
to enable the development of wheat lines carrying null
mutations in each of the genomes for a specific gene (for
example SBE I, SSS I or DBE I) or combinations of null alleles
for different genes.

[Table 7: primer pairs for SBE I, SSS I and DBE I; the table
contents are not recoverable from the scanned page.]

It will be apparent to the person skilled in the
art that while the invention has been described in some
detail for the purposes of clarity and understanding,
various modifications and alterations to the embodiments and
methods described herein may be made without departing from
the scope of the inventive concept disclosed in this
specification.

References cited herein are listed on the following
pages, and are incorporated herein by this reference.
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SEQUENCE LISTING 

(1) GENERAL INFORMATION:

    (i) APPLICANT:
        (A) NAME: COMMONWEALTH SCIENTIFIC AND INDUSTRIAL
            RESEARCH ORGANISATION
        (B) STREET: Limestone Avenue
        (C) CITY: Campbell
        (D) STATE: ACT
        (E) COUNTRY: AUSTRALIA
        (F) POSTAL CODE (ZIP): 2612

        (A) NAME: THE AUSTRALIAN NATIONAL UNIVERSITY
        (B) STREET: BRIAN LEWIS CRESCENT
        (C) CITY: ACTON
        (D) STATE: ACT
        (E) COUNTRY: AUSTRALIA
        (F) POSTAL CODE (ZIP): 2601

        (A) NAME: GOODMAN FIELDER LIMITED
        (B) STREET: LEVEL 42, GROSVENOR PLACE
        (C) CITY: SYDNEY
        (D) STATE: NSW
        (E) COUNTRY: AUSTRALIA
        (F) POSTAL CODE (ZIP): 2000

        (A) NAME: GROUPE LIMAGRAIN PACIFIC PTY LIMITED
        (B) STREET: LEVEL 31, 1 O'CONNELL STREET
        (C) CITY: SYDNEY
        (D) STATE: NSW
        (E) COUNTRY: AUSTRALIA
        (F) POSTAL CODE (ZIP): 2000

    (ii) TITLE OF INVENTION: REGULATION OF GENE EXPRESSION IN PLANTS

    (iii) NUMBER OF SEQUENCES: 17

    (iv) COMPUTER READABLE FORM:
        (A) MEDIUM TYPE: Floppy disk
        (B) COMPUTER: IBM PC compatible
        (C) OPERATING SYSTEM: PC-DOS/MS-DOS
        (D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 17 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
        (A) DESCRIPTION: /desc = "PCR primer based on the N-terminal sequence of wSBE I 5' end
            position 168 of SEQ ID NO:5"

    (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE:

    (v) FRAGMENT TYPE:

    (vi) ORIGINAL SOURCE:
        (A) ORGANISM: Triticum tauschii
        (F) TISSUE TYPE: Endosperm

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

GGCACGCGAG AGACTGG                                                   17

(2) INFORMATION FOR SEQ ID NO: 2:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 19 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
        (A) DESCRIPTION: /desc = "PCR primer in which 5' end is at position 1590 of SEQ ID NO:5"

    (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE:

    (v) FRAGMENT TYPE:

    (vi) ORIGINAL SOURCE:
        (A) ORGANISM: Triticum tauschii
        (F) TISSUE TYPE: Endosperm

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

TACATTTCCT TGTCCATCA                                                 19

(2) INFORMATION FOR SEQ ID NO: 3:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 18 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
        (A) DESCRIPTION: /desc = "PCR primer 5' end is at position 1 of SEQ ID NO:5"

    (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE:

    (v) FRAGMENT TYPE:

    (vi) ORIGINAL SOURCE:
        (A) ORGANISM: Triticum tauschii
        (F) TISSUE TYPE: Endosperm

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATCACGAGAG CTTGCTCA                                                  18

(2) INFORMATION FOR SEQ ID NO: 4:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 23 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: other nucleic acid
        (A) DESCRIPTION: /desc = "PCR primer 5' end is at position 334 of SEQ ID NO:5"

    (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE:

    (v) FRAGMENT TYPE:

    (vi) ORIGINAL SOURCE:
        (A) ORGANISM: Triticum tauschii
        (F) TISSUE TYPE: Endosperm

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CGGTACACAG TTGCGTCATT TTC                                            23

(2) INFORMATION FOR SEQ ID NO: 5:

    (i) SEQUENCE CHARACTERISTICS:
        (A) LENGTH: 2687 base pairs
        (B) TYPE: nucleic acid
        (C) STRANDEDNESS: single
        (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE:

    (vi) ORIGINAL SOURCE:
        (A) ORGANISM: Triticum tauschii
        (F) TISSUE TYPE: Endosperm

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
ATCGACGAAG ATGCTCTGCC TCACCGCCCC CTCCTGCTCG CCATCTCTCC CGCCGCGCCC 60 
50 CTCCCGTCCC GCTGCTGACC GGCCCGGACC GGGGATTTCG GCCAAGAGCA AGTTCTCTGT 12 0 
TCCCGTGTCT GCGCCAAGAG ACTACACCAT GGCAACAGCT GAAGATGGTG TTGGCGACCT 180 
TCCGATATAC GATCTGGATC CGAAGTTTGC CGGCTTCAAG GAACACTTCA GTTATAGGAT 24 0 

55 

GAAAAAGTAC CTTGACCAGA AACATTCGAT TGAGAAGCAC GAGGGAGGCC TTGAAGAGTT 3 00 
CTCTAAAGGC TATTTGAAGT TTGGGATCAA CACAGAAAAT GACGCAACTG TGTACCGGGA 360
ATGGGCCCCT GCAGCAATGG ATGCACAACT TATTGGTGAC TTCAACAACT GGAATGGCTC 420
TGGGCACAGG ATGACAAAGG ATAATTATGG TGTTTGGTCA ATCAGGATTT CCCATGTCAA 480
TGGGAAACCT GCCATCCCCC ATAATTCCAA GGTTAAATTT CGATTTCACC GTGGAGATGG 540
ACTATGGGTC GATCGGGTTC CTGCATGGAT TCGTTATGCA ACTTTTGACG CCTCTAAATT 600
TGGAGCTCCA TATGACGGTG TTCACTGGGA TCCACCTTCT GGTGAAAGGT ATGTGTTTAA 660
GCATCCTCGG CCTCGAAAGC CTGACGCTCC ACGTATTTAC GAGGCTCATG TGGGGATGAG 720
TGGTGAGAGG CCTGAAGTAA GCACATACAG AGAATTTGCA GACAATGTGT TACCGCGCAT 780
AAAGGCAAAC AACTACAACA CAGTTCAGCT GATGGCAATC ATGGAACATT CCATATTATG 840
CTTCTTTTGG TACCATGTGA CGAATTTCTT CGCAGTTAGC AGCAGATCAG GAACACCAGA 900
GGACCTCAAA TATCTTGTTG ACAAGGCACA TAGCTTAGGG TTGCGTGTTC TGATGGATGT 960
TGTCCATAGC CATGCGAGCA GTAATATGAC AGATGGTCTA AATGGCTATG ATGTTGGACA 1020
AAACACACAG GAGTCCTATT TCCATACAGG AGAAAGGGGT TATCATAAAC TGTGGGATAG 1080
TCGCCTGTTC AACTATGCCA ATTGGGAGGT CTTACGGTAT CTTCTTTCTA ATCTGAGATA 1140
TTGGATGGAC GAATTCATGT TTGACGGCTT CCGATTTGAT GGAGTAACAT CCATGCTATA 1200
TAATCACCAT GGTATCAATA TGTCATTCGC TGGAAATTAC AAGGAATATT TTGGTTTGGA 1260
TACCGATGTA GATGCAGTTG TTTACATGAT GCTTGCGAAC CATTTAATGC ACAAAATCTT 1320
GCCAGAAGCA ACTGTTGTTG CAGAAGATGT TTCAGGCATG CCAGTGCTTT GTCGGTCAGT 1380
TGATGAAGGT GGAGTAGGGT TTGACTATCG CCTTGCTATG GCTATTCCTG ATAGATGGAT 1440
TGACTACTTG AAGAACAAAG ATGACCTTGA ATGGTCAATG AGTGCAATAG CACATACTCT 1500
GACCAACAGG AGATATACGG AAAAGTGCAT TGCATATGCT GAGAGCCACG ATCAGTCTAT 1560
TGTTGGCGAC AAGACTATGG CATTTCTCTT GATGGACAAG GAAATGTATA CTGGCATGTC 1620
AGACTTGCAG CCTGCTTCAC CTACAATTGA TCGTGGAATT GCACTTCAAA AGATGATTCA 1680
CTTCATCACC ATGGCCCTTG GAGGTGATGG CTACTTGAAT TTTATGGGTA ATGAGTTTGG 1740
CCACCCAGAA TGGATTGACT TTCCAAGAGA AGGCAACAAC TGGAGTTATG ATAAATGCAG 1800
ACGCCAGTGG AGCCTCTCAG ACATTGATCA CCTACGATAC AAGTACATGA ACGCATTTGA 1860
TCAAGCAATG AATGCGCTCG ACGACAAGTT TTCCTTCCTA TCGTCATCAA AGCAGATTGT 1920
CAGCGACATG AATGAGGAAA AGAAGATTAT TGTATTTGAA CGTGGAGATC TGGTCTTCGT 1980
CTTCAATTTT CATCCCAGTA AAACTTATnA TCZCZT'V A C A A A All ICjCC ICjGi 2040
GAAGTACAAG GTAGCTCTGG ACTCCGATGC TCTGATGTTT GGTGGACATG GAAGAGTGGC 2100
CCAGTACAAC GATCACTTCA CGTCACCTGA AGGAGTACCA GGAGTACCTG AAACAAACTT 2160
CAACAACCGC CCTAATTCAT TCAAAGTCCT GTCTCCACCC CGCACTTGTG TGGCTTACTA 2220
TCGCGTCGAG GAAAAAGCGG AAAAGCCTAA GGATGAAGGA GCTGCTTCTT GGGGCAAAGC 2280
TGCTCCTGGG TACATCGATG TTGAAGCCAC TCGTGTCAAA GACGCAGCAG ATGGTGAGGC 2340
GACTTCTGGT TCCAAAAAGG CGTCTACAGG AGGTGACTCC AGCAAGAAGG GAATTAACTT 2400
TGTCTTCGGG TCACCTGACA AAGATAACAA ATAAGCACCA TATCAACGCT TGATCAGAAC 2460
CGTGTACCGA CGTCCTTGTA ATATTCCTGC TATTGCTAGT AGTAGCAATA CTGTCAAACT 2520
GTGCAGACTT GAGATTCTGG CTTGGACTTT GCTGAGGTTA CCTACTATAT AGAAAGATAA 2580
ATAAGAGGTG ATGGTGCGGG TCGAGTCCGG CTATATGTGC CAAATATGCG CCATCCCGAG 2640
TCCTCTGTCA TAAAGGAAGT TTCGGGCTTT CAGCCCAGAA TAAAAAA 2687



(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 807 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm
(ix) FEATURE:
(A) NAME/KEY: Protein
(B) LOCATION: 1..807
(D) OTHER INFORMATION:/label= sbel
/note= "deduced amino acid sequence from SEQ ID NO:5"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Leu Cys Leu Thr Ala Pro Ser Cys Ser Pro Ser Leu Pro Pro Arg
1 5 10 15

Pro Ser Arg Pro Ala Ala Asp Arg Pro Gly Pro Gly Ile Ser Ala Lys
20 25 30

Ser Lys Phe Ser Val Pro Val Ser Ala Pro Arg Asp Tyr Thr Met Ala
35 40 45

Thr Ala Glu Asp Gly Val Gly Asp Leu Pro Ile Tyr Asp Leu Asp Pro
50 55 60

Lys Phe Ala Gly Phe Lys Glu His Phe Ser Tyr Arg Met Lys Lys Tyr
65 70 75 80

Leu Asp Gln Lys His Ser Ile Glu Lys His Glu Gly Gly Leu Glu Glu
85 90 95

Phe Ser Lys Gly Tyr Leu Lys Phe Gly Ile Asn Thr Glu Asn Asp Ala
100 105 110

Thr Val Tyr Arg Glu Trp Ala Pro Ala Ala Met Asp Ala Gln Leu Ile
115 120 125

Gly Asp Phe Asn Asn Trp Asn Gly Ser Gly His Arg Met Thr Lys Asp
130 135 140

Asn Tyr Gly Val Trp Ser Ile Arg Ile Ser His Val Asn Gly Lys Pro
145 150 155 160

Ala Ile Pro His Asn Ser Lys Val Lys Phe Arg Phe His Arg Gly Asp
165 170 175

Gly Leu Trp Val Asp Arg Val Pro Ala Trp Ile Arg Tyr Ala Thr Phe
180 185 190

Asp Ala Ser Lys Phe Gly Ala Pro Tyr Asp Gly Val His Trp Asp Pro
195 200 205

Pro Ser Gly Glu Arg Tyr Val Phe Lys His Pro Arg Pro Arg Lys Pro
210 215 220

Asp Ala Pro Arg Ile Tyr Glu Ala His Val Gly Met Ser Gly Glu Arg
225 230 235 240

Pro Glu Val Ser Thr Tyr Arg Glu Phe Ala Asp Asn Val Leu Pro Arg
245 250 255

Ile Lys Ala Asn Asn Tyr Asn Thr Val Gln Leu Met Ala Ile Met Glu
260 265 270

His Ser Ile Leu Cys Phe Phe Trp Tyr His Val Thr Asn Phe Phe Ala
275 280 285

Val Ser Ser Arg Ser Gly Thr Pro Glu Asp Leu Lys Tyr Leu Val Asp
290 295 300

Lys Ala His Ser Leu Gly Leu Arg Val Leu Met Asp Val Val His Ser
305 310 315 320

His Ala Ser Ser Asn Met Thr Asp Gly Leu Asn Gly Tyr Asp Val Gly
325 330 335

Gln Asn Thr Gln Glu Ser Tyr Phe His Thr Gly Glu Arg Gly Tyr His
340 345 350

Lys Leu Trp Asp Ser Arg Leu Phe Asn Tyr Ala Asn Trp Glu Val Leu
355 360 365

Arg Tyr Leu Leu Ser Asn Leu Arg Tyr Trp Met Asp Glu Phe Met Phe
370 375 380

Asp Gly Phe Arg Phe Asp Gly Val Thr Ser Met Leu Tyr Asn His His
385 390 395 400

Gly Ile Asn Met Ser Phe Ala Gly Asn Tyr Lys Glu Tyr Phe Gly Leu
405 410 415

Asp Thr Asp Val Asp Ala Val Val Tyr Met Met Leu Ala Asn His Leu
420 425 430

Met His Lys Ile Leu Pro Glu Ala Thr Val Val Ala Glu Asp Val Ser
435 440 445

Gly Met Pro Val Leu Cys Arg Ser Val Asp Glu Gly Gly Val Gly Phe
450 455 460

Asp Tyr Arg Leu Ala Met Ala Ile Pro Asp Arg Trp Ile Asp Tyr Leu
465 470 475 480

Lys Asn Lys Asp Asp Leu Glu Trp Ser Met Ser Ala Ile Ala His Thr
485 490 495

Leu Thr Asn Arg Arg Tyr Thr Glu Lys Cys Ile Ala Tyr Ala Glu Ser
500 505 510

His Asp Gln Ser Ile Val Gly Asp Lys Thr Met Ala Phe Leu Leu Met
515 520 525

Asp Lys Glu Met Tyr Thr Gly Met Ser Asp Leu Gln Pro Ala Ser Pro
530 535 540

Thr Ile Asp Arg Gly Ile Ala Leu Gln Lys Met Ile His Phe Ile Thr
545 550 555 560

Met Ala Leu Gly Gly Asp Gly Tyr Leu Asn Phe Met Gly Asn Glu Phe
565 570 575

Gly His Pro Glu Trp Ile Asp Phe Pro Arg Glu Gly Asn Asn Trp Ser
580 585 590

Tyr Asp Lys Cys Arg Arg Gln Trp Ser Leu Ser Asp Ile Asp His Leu
595 600 605

Arg Tyr Lys Tyr Met Asn Ala Phe Asp Gln Ala Met Asn Ala Leu Asp
610 615 620

Asp Lys Phe Ser Phe Leu Ser Ser Ser Lys Gln Ile Val Ser Asp Met
625 630 635 640

Asn Glu Glu Lys Lys Ile Ile Val Phe Glu Arg Gly Asp Leu Val Phe
645 650 655

Val Phe Asn Phe His Pro Ser Lys Thr Tyr Asp Gly Tyr Lys Val Gly
660 665 670

Cys Asp Leu Pro Gly Lys Tyr Lys Val Ala Leu Asp Ser Asp Ala Leu
675 680 685

Met Phe Gly Gly His Gly Arg Val Ala Gln Tyr Asn Asp His Phe Thr
690 695 700

Ser Pro Glu Gly Val Pro Gly Val Pro Glu Thr Asn Phe Asn Asn Arg
705 710 715 720

Pro Asn Ser Phe Lys Val Leu Ser Pro Pro Arg Thr Cys Val Ala Tyr
725 730 735

Tyr Arg Val Glu Glu Lys Ala Glu Lys Pro Lys Asp Glu Gly Ala Ala
740 745 750

Ser Trp Gly Lys Ala Ala Pro Gly Tyr Ile Asp Val Glu Ala Thr Arg
755 760 765

Val Lys Asp Ala Ala Asp Gly Glu Ala Thr Ser Gly Ser Lys Lys Ala
770 775 780

Ser Thr Gly Gly Asp Ser Ser Lys Lys Gly Ile Asn Phe Val Phe Gly
785 790 795 800

Ser Pro Asp Lys Asp Asn Lys
805
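
The feature note for SEQ ID NO: 6 describes it as deduced from SEQ ID NO: 5. As a minimal sketch of how such a deduction can be checked, the Python snippet below translates the first 60 bases of SEQ ID NO: 5 with the standard genetic code. The function name, the variable names, and the choice to start reading at the first ATG are assumptions made for illustration; the listing itself does not state how the reading frame was chosen.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# First row of SEQ ID NO: 5 as printed in the listing (positions 1-60).
SEQ5_FIRST_ROW = (
    "ATCGACGAAG ATGCTCTGCC TCACCGCCCC CTCCTGCTCG CCATCTCTCC CGCCGCGCCC"
)


def translate_from_first_atg(dna):
    """Translate from the first ATG until a stop codon or the end of the input.

    Assumption: the open reading frame begins at the first ATG. Whitespace
    between the 10-base groups is ignored; unknown codons become 'X'.
    """
    seq = "".join(dna.split()).upper()
    start = seq.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate_from_first_atg(SEQ5_FIRST_ROW))  # MLCLTAPSCSPSLPPR
```

The one-letter result corresponds to the first sixteen residues listed for SEQ ID NO: 6 (Met Leu Cys Leu Thr Ala Pro Ser Cys Ser Pro Ser Leu Pro Pro Arg).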

(2) INFORMATION FOR SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 319 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm
(ix) FEATURE:
(A) NAME/KEY: misc_signal
(B) LOCATION: 1..319
(D) OTHER INFORMATION:/function= "3' untranslated region of wSBE I-D4 cDNA"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GCGACTTCTG GTTCCAAAAA GGCGTCTACA GGAGGTGACT CCAGCAAGAA GGGAATTAAC 60
TTTGTCTTCG GGTCACCTGA CAAAGATAAC AAATAAGCAC CATATCAACG CTTGATCAGA 120
ACCGTGTACC GACGTCCTTG TAATATTCCT GCTATTGCTA GTAGTAGCAA TACTGTCAAA 180
CTGTGCAGAC TTGAGATTCT GGCTTGGACT TTGCTGAGGT TACCTACTAT ATAGAAAGAT 240
AAATAAGAGG TGATGGTGCG GGTCGAGTCC GGCTATATGT GCCAAATATG CGCCATCCCG 300
AGTCCTCTGT CATAAAGGA 319
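
Each sequence row in this listing ends with a running count of the bases printed so far, which makes transcription errors detectable. As a hedged sketch (the function name and the row format it expects, space-separated groups followed by one integer, are assumptions for illustration), the following Python snippet recomputes the running total for SEQ ID NO: 7 and reports any row whose printed counter disagrees.

```python
# Rows of SEQ ID NO: 7 as listed: base groups followed by the running count.
SEQ7_ROWS = [
    "GCGACTTCTG GTTCCAAAAA GGCGTCTACA GGAGGTGACT CCAGCAAGAA GGGAATTAAC 60",
    "TTTGTCTTCG GGTCACCTGA CAAAGATAAC AAATAAGCAC CATATCAACG CTTGATCAGA 120",
    "ACCGTGTACC GACGTCCTTG TAATATTCCT GCTATTGCTA GTAGTAGCAA TACTGTCAAA 180",
    "CTGTGCAGAC TTGAGATTCT GGCTTGGACT TTGCTGAGGT TACCTACTAT ATAGAAAGAT 240",
    "AAATAAGAGG TGATGGTGCG GGTCGAGTCC GGCTATATGT GCCAAATATG CGCCATCCCG 300",
    "AGTCCTCTGT CATAAAGGA 319",
]


def check_row_counters(rows):
    """Return (total_bases, problems).

    problems lists (row_number, printed_counter, computed_total) for every
    row whose trailing counter does not match the bases seen so far.
    """
    total = 0
    problems = []
    for number, row in enumerate(rows, start=1):
        *groups, counter = row.split()
        total += sum(len(group) for group in groups)
        if int(counter) != total:
            problems.append((number, int(counter), total))
    return total, problems


total, problems = check_row_counters(SEQ7_ROWS)
print(total, problems)  # 319 []
```

The computed total agrees with the stated LENGTH of 319 base pairs; a base dropped from any group would shift the computed total for that row and every row after it.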

(2) INFORMATION FOR SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4890 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm
(ix) FEATURE:
(A) NAME/KEY: promoter
(B) LOCATION: 1..4890
(D) OTHER INFORMATION:/function= "promoter containing sequence of SBE I"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GGGTGGCGGG TCGGGCGGCA AGGCGCGGGG CGGCGGGGCG GCCGGGGCGG CGCGGCGGCG 60
CGGGCGGCAG CGGCGGCTAG GGTTTCGCGG CGGCGGCGAC TTGGGCTGAG GCGGGGCACG 120
GGCTGCGGCT TTAAAGGCCG GCCAGGCTGA GGTGTCCGGG TCGGACACGG CCCGTAAGGC 180
GGTTGACTTT AAAAAATAAT AATTCGGACA TGCAAAAAAG TAAGAAAAGA AATAATAAAC 240
GGACTCCAAA AATCCCGAAG TAAATTTTTC CCCATTCTTA AAAATAAGCC GGACAAGATG 300
AACATTTATT TGGGCCTAAA ATGCAATTTT GAAAAATGCG TATTTTTCCT AATTCGGAAT 360
AAAATCAAAT AAAATCCAAA TAAAATCAAA TATTTGTTTT TAATATTTTT CCTCCAATAT 420
TTCATTATTT GTGAAGAAGT CATTTTATCC CATCTCATAT ATTTTGATAT GAAATATTTT 480
CGGAGAGAAA AATAATTAAA ACAAATGATC CTATTTTCAA AATTTGAGAA AACCCAAATA 540
TGAAAATAAC GAAATCCCCA ACTCTCTCCG TGGGTCCTTG AGTTGCGTGA AATTTCTAGG 600
ATCACAAATC AAAATGCAAT AAAATATGAT ATGCATGATG ATCTAATGTA TAACATTCCA 660
ATTGAAAATT TGGGATGTTA CATATAACTC AAATTCTATA ATTATGAACA CAGAAATATT 720
AATGTAGAAC TCTATTTTGT TTTGAAATTG TATTATTTTT TAGAATTAGT CTAGAGCATT 780
TCGTGAACTT GAATCAAACC TTTAAATAAA ACAAAGCATA AAAATGACAA ATTCACATAT 840
GAAATAACTT GTGTTACATA GATTTATTAC AATAGCGTTG TATGTGTGTA TGTGTGCGTG 900
AGTGCCTATG GTAATATCAA TAAATATCTT GATAGATGTT TCTACAATTC ACGGGTCTAA 960
CTAGTAATGC AATGCAATGC ATGCTAAAAG AATAGAACCT TAGTTTCATT TAACTAACAA 1020
TTTTCAAATG TATGAGTTGC CAACAAGTGG CATACTTGGC ACTGTTTGTT TGTTCATTTT 1080
ATGGAAAGTT CTTCTCTTTT TACATGGTTT AGATTCCAGC ATGTAGCCAC AAAATATGAT 1140
TGTCAAAAGA TAATACCTCA TAATACAATT CCACTAAAGT CACCTAGCCC AAGTGACCGA 1200
CCTGATCCTG AAATAAAATC AGAAGATTTG GTGTCATCAT CATGACAACA AATTATTAGG 1260
CGGTAGATCT TGTGGTAGTA CTCATGATGT AAAATTATCA AGAGGGAGAG AATGTATGGA 1320
GATTTATGTG AAGTACATCG TACACCAGAC ATAGTTGACA CATCGATTTT TTAAGATACA 1380
TTTGGACGCG CCTTGTGGGA GTGTAAAGTA CTACCATGTA TTAGAAGAGG TGAAATGAGA 1440
AATGCCATAG CTAGCAAGTA GGCCTAGTTA AGGAAATTCT TCCTTAGATC CCCTTCTCCC 1500
GAAGAGTGAA GTGCTTCAAC TAAAGGTTAG ACCCACTTAA AAAATGTCAC TTTGAATCTT 1560
TGCTTCCCTT GTCGTAATCC TGTGCATTTG TAGGTCCCTC GGATCTGAGC CCTTTCTCCA 1620
AGCCCTTCAT TGGATTCCCC TGGATGTCTT TTTGTTACAT TTTATTGAAG TGAGAGTGAA 1680
TTATTATATG CCCATAGGAG GTGGGATATA AAGGCTGTTG GTATTCTGCA CCATACATGC 1740
TAGAGTAGGG AGGAGAGGCT GGTGCATGAT ACATGGTGGA CTAGCCCATA TATTTACCCC 1800

TCCCCCACCC ACTAACAAGT TTTTTTTATT AGGTCTTCAT CCTCTGATTT GTTTTTCTGT 1860
TAGCCCATTC TTCATCATGG ACTTATTAAT CATGATTAGT TTCTTGGATT TTTGTTTACT 1920
TGACTTGAAT TTGACAATGT GCCTCATATA TGGCATGTGG GACTGATAGG AAGATATATT 1980
CTCACAACAT TAACTTAAAA AGGATTATTT TTTTGGTGCA GTCGTAAAGA AAACTACTTT 2040
CTTTTATGCT AAAAGTTATT CAAACATAGA TTTATAAACA AAGGATATCA CCATGCATGA 2100
CCATGCGCTC TCTCATGTTT ACTCTAGAAA CCATATATCT CTTTGTTGCA AAATATTTAA 2160
TCTATCCTCC TTGTTTCTGG GAATGAGTCG GGGAAGGTAA TCTTAGGGAA GGTTAAAGTG 2220
AGGCAAGTAA GAGCAACTCT AGCAGAGTCG CGATATGCCC AATCGCCATA ATGCCAATAT 2280
GGCATTTTTG GCCCAAAATG GCACTTCAGA AGAGTCACCA TATCCCTTCG GATAGCCATA 2340
ATTTAGGGAG CTCGCTCCAC AAACAAGCTT CGAGCCTCCA AATATGGAGG CCATGGATTC 2400
GTTGTTTGGC ACTCACTCCA TATCCAACCG CAAGCGCATG CATGAGGGAA GTTTTAGCTT 2460
CTTCCTCCTT GCGCCAACGC CGGGATTTTA CACAGCGCAT TACAGGTACA TGAACCAGCA 2520
TGCACAGATA ATCACCGACG AGTGGGGTGA CAAGAAGGAT AAGCACCCTC CCATTAGTGG 2580
TGCGCCCACT CCCCTCAAAT TCATGAGGCA GCCATTTGGA TGGTCATCGC GTGGtATAAG 2640
CTCCGACTAT AAAATCTCAA CGGCATCACC AAAACCATAG CTGCCGCCTC CCCCTTCCTC 2700
GGCATCACCT CCCCAAGACA TCTCCTCCCC TCTATGCCAC AATGTCATCA TTATGGAGAG 2760
ACACAACTAC TGGTAAACCG CATACCCAAT CATGGTTTAC CGGCAGTGCG AACCCCACCT 2820
TCCTCCCACG ATGGTAGGAT ATTCTCCTCC TAGAATGGCG CGTGTGGCGC TTCCTCCTCC 2880
CGAGGCTGAT ATGTCGGCTC CCATGATGGC GTGCATCATT GATTTGGCGC TTCGGGTCCA 2940
TCATACATGT TAACGAGGTC ATCCCCATTG ATGTCGTTGG TCCCCTTGCC CCCCAGTCGG 3000
ATCCTGAGGA CCCGTTCGAT GTCGCAATGC GACTCTCCAA ACTCAAAGCT CACAATGAGG 3060
AGTACGTCCT CTAGGAGTTC CGCCCCGCAA CCATCTATAA GGAGGAGCAA CGATAGCTCT 3120
CCCCTACGCC TTCCTCGACG ATCTCTCTTA GGAGGACAAC GGCTAGACGA CGGCGGCGGC 3180
GGCGAAGGTA CTGCAGGTAG TAGAACATAG CAATGTCGAA TGGCGACATT GCATATTTTG 3240
AAAATGTCGC TCAACGACTT TTGAAGTCGC AAATAAAATG TAGTGTGACT ACTTTTGGCC 3300
AGCAATATAA GTTTATCACA TTTGATAATG ATTTGAACCG GTGTGGTTCA ACTAAATGTA 3360
CCATAAATTG AACATACAAA TTTTTAGCAA ATGAAAAAAG AAACAAGTAA GACCACAAAT 3420
ATGAAAGCCG CATATCGCGA CTATGTGTTT GAGCCGCAGC TGCCAAGTAC ATATGAAGCG 3480
TACTCCATAT GACATACGAC AACCATACAT ATGAAGACTC TACTAGAGTT CTCTAAGGCC 3540
GCTTTTAGCG CCTTTCGTGC AGTGGTGCCC ATAGGGAGTG AGGGTAGTTG GACTGTTCGT 3600
TTCCCCTTTT TTCATTTCTT TGAAATCTAT TTTATTTTTT TTCTCTTTTG TAGGTTTCCC 3660
AAATTTATAT ACCATTTTTC TGTTTCTCGC TATTTTTTGT TGTTATATTC TAGTTTCATA 3720
TTTTTCTATT ATTAATTTGT GTCTCTTATG AGAAGTCCAG ACTTGCATAT GGAGGTGCAC 3780
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ACACAAACAT 


AT AAAG T AT A 


AATACTAAPT 


TPAPAAP.TAT 


U 1 1 1 V_l\_, Vj X \J\J 


TP AAAAAAAP 
1 v_. AAAAAAAv— 


3 o *± \J 


ATCATCAAAA 


CCTGCCAATA 


TGAGATATAP 


TTTTflAATAT 

X X X X OAVn X X 


ATP A ATATP& 
f\ X V— AA X A 1 on 


PP A A PPP A A C 


Q Pi Pt 


CATTTAAAAT 


GTGAACAATT 


GTTTTTTTAP* 

X X X X X X X £±\J 


AAAAAATATA 


AP,A A ATA APT 
nunAn X AA\«- X 


PP A APPP APP 
LLAALLLAbL 


Q Pi 


CAAACCACAT 


GCTATAPAPT 

vj\— x x *vv — rvv^ x 


TGCTPCATAT 

X VJ X x— xi X xx X 


P.AAAPPATGT 

vjrvrvriv* v— n x v_j x 


TTPPT A TTPP 
X X w v_. X A X X vJU 


PPAPTTPPPT 
vjv_ Avj 1 1 bH, 1 


yl Pi O P| 
*i vj Z U 


G AAAC C G AAA 


GTAATGTTAG 

W X il-ti. X X. X AiV_J 


TCGTTTTTCT 

V— \— V7 X X X X X \_» X 


ATTPAAAGAA 


f,AAr;P,APAPT 

Ort/"\Vjv_TrtV_7rtV_i 1 


PP A PPTP A PP 


A PiO Pi 
4i U t3 U 


CG ATGC TTAG 


APGTGAGATG 

•tAV_ VJ X VJrt-VJrt. X \_t 


GGGATGAPPA 


PAAPP,TPPPT 
^-rtJlv-Vjj X v_ v_ v_ 1 


APAPAPAPPT 


C A C^C'C^C* AT AT 

LAL v- ubAbA 1 


41*tU 


GGGGACATTG 


PAGTTGAPAP 




a p pp, p. p t p c a 




bbLAALA 1 Vj7 1 


/i o n n 


GGCGAGGCGG 


APPTPHPifiPT 
<r_v_\j x v_ v_i\jv_j v_ x 


VJ\jV_ftUvJ X /t. W V_J 




a a pp a rpppr 


/^/^ A A A O A A 


>1 o c n 
4_bU 


GAGGAGTAGP 


PTHP A A A A P A 


TGGTAPAPPA 


PTTTTPTPPP 


PT A PP A A A A P" 


L. 1 L.A I 1 lCA 1 


4 j_U 


TCCGGPAPPG 

X V_ \_ \_ \_ \_. i*N_. \_. V— 




P A APP A APP A 

V— XTtrt \— V— /"_rt\_ V_ rt 


X v_.\_iv_rt\j 1 V_\_l_ 


ALA 1 \j x v_.v_.L- X 


L. 1 I Q, I x ICj 


y( "J Oft 


fAAAAARTAA 

V — /iXl/^rV/V\J X _*_rt 


rp rp rnrp <-» rp rp 


Tfl^APAPPPP 


A A BPfiPTJ A A 

n/inbHo 1 AAA 


Lilllbii Avj 


TTTTCATTTC 


444 0 


TAG AAA AAHP 


A A r PPP , P'l ,, f ,r P A 
rtA. 1 1 X X X A 


T* A P HTTP r P r P r P r P 
X Avj 1 1L 1 X XX 


PqipA A Afma A 
Vj X u/vi/iu 1 Art 


1 vjL. Ill 1A1A 




/ r n rv 


TflT TP TTTT A 

-L Vjr X X v_ X X X X 


HAl^rAAATAT 
v_».r\.vj\ .rt_r\_rt. X A X 


rri rTi ^rri rri ryi rTi rT^iTi 
V_.XXl— XXXXXX 


X X X X ALtvj\_xAA 


AALtAvjU AAA 1 


A 1L. 1 I (_ (_ AC I 


/i CCA 


TTTP APiAAA 
X x X Vwr"_v_. rvArvrt 


v_ X \_» AV_.VjAA\_t\_» 


ptvi a a aptpp 


pp ap ar aptp 

CIjAvjALA^ x vj 


Al_»t_tL7v_t_.C_.A 1 A 


P* rp rp rp /— • /~» rn /-» 

GCTTTCGTCC 


/l _T O Pi 


V_» \J V_ V— V— .rtAjk- \J\J 


ppp a or* a p pp 

*j L. AC Vj AC V— 


t p p a pp tp p a 

X l_-C AV_Vj 1 v_rv_ A 


CCCCvjtjCCC 1 


L. V_ (j \j Cj L. C C G 


CAGATCCGTT 


/icon 
4 6 3 0 


CTCCCTCGCC CCCGTTTCCC CCTCCCTCCC TCTCGTTGCT TCCACTCCAC TGTTCTCCTC 4740
TTCCTGTCCA AAGCGGCCAC GGACCGGAAA AAAATCACGC CTTTCCGTTG GGTCTCCGGC 4800
GCCACACTCC TCCTCCGGCC GATATAAAGC GCGCGGGGCC ACGGGCCCGG CGCAAAATGG 4860
GATTCCCGTC CGCCGCCATG GAGGAAGATG 4890










(2) INFORMATION FOR SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6228 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1
(D) OTHER INFORMATION:/product= "coding region of wSBE I-D4 gene"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

ACGGGCCCGG CGCAAAATGG GATTCCCGTC CGCCGCCATC GACGAAGATG CTCTGCCTCA 60
CCGCCCCCTC CTGCTCGCCA TCTCTCCCGC CGCGCCCCTC CCGTCCCGCT GCTGACCGGC 120
CCGGACCGGG GATCTCGGTG AGTCAGTCGG GATCTTCATT TCTTTTCTTT TCTTTCGTTT 180
CCGGCTCCGT TCTGCCGGGG TTTCCCTGAT GCGATGCCGC GCGCGCGCAG GGCGGCGGCA 240
ATGTGCGGCT GAGCGCGGTG CCCGCGCCCT CTTCGCTCCG CTGGTCGTGG CCGCGGAAGG 300
TGAGCCCTCT CCCCTGTCTA CCCAGATTTG CGACCGTGAT CCCCTGTTGT CGCCGGGCAA 360
ACGGAATCTG ATCCACGGTG GTTATTGGAA ATAGTATATA CTACTAATAA ACTTGAGGCT 420
GGGATTCGTC CACTGAGGAA CAAGTGGATG CGATTTCGAT TGGATTTCTC TGCTTTATGC 480
GATCCGTACG CAGAATATCC CTCCTGCAGT GTCTCAACCG TATTACTGGA TGTACAACCC 540
AAATGTGTAT AATCTGTGCT GAATGTATCA ACCAATAATT GCTGCATTGT GAAAACATAA 600
TCCTGTGTTG TGTCTCTACT ACTTGTTCAG TCCTGATCTG CCGCTTATCC TAACTTTTGT 660
TCATTTATGG AAGGCCAAGA GCAAGTTCTC TGTTCCCGTG TCTGCGCCAA GAGACTACAC 720
CATGGCAACA GCTGAAGATG GTGTTGGCGA CCTTCCGATA TACGATCTGG ATCCGAAGTT 780
TGCCGGCTTC AAGGAACACT TCAGTTATAG GATGAAAAAG TACCTTGACC AGAAACATTC 840
GATTGAGAAG CACGAGGGAG GCCTTGAAGA GTTCTCTAAA GGTTAGCTTT TGTTTCATGT 900
GTTTGAAACA ATAGTTACAT CTTGTGGCGT CCGCAGCACA AAAGACATAA TGCGACTCTG 960
TTTTGTAGGC TATTTGAAGT TTGGGATCAA CACAGAAAAT GACGCAACTG TGTACCGGGA 1020
ATGGGCCCCT GCAGCAATGT AAGTTCTAGT GTTGTCACGC AACTAATTGC AATGGTCGTT 1080
GGTTAACTTA TGAAGTGCTG ATGAAACTGT CTTAAGAGTT TATGGCTTGT CTTTTCTGAT 1140
TCTAGCTAGT AAAGAGTAGA TAAATATGAA ATATGTTTTC CCTTTTCTAG TTATGGTCAT 1200
GGTTGGCTGG TATTCATTTC TTTTATGGCA ATACTTGCTT CTAACTATCT TTAGTAGATT 1260
CATGTATTTA CTTGTGAGTC ATTACTTTAT GGGTGTAGGG ATGCACAACT TATTGGTGAC 1320
TTCAACAACT GGAATGGCTC TGGGCACAGG ATGACAAAGG ATAATTATGG TGTTTGGTCA 1380
ATCAGGATTT CCCATGTCAA TGGGAAACCT GCCATCCCCC ATAATTCCAA GGTTAAATTT 1440
CGATTTCACC GTGGAGATGG ACTATGGGTC GATCGGGTTC CTGCATGGAT TCGTTATGCA 1500
ACTTTTGATG CCTCTAAATT TGGAGCTCCA TATGACGGTG TTCACTGGGA TCCACCTTCT 1560
GGTGAAAGGT CTACTTTTAG TGGCTCGAGA GCAAGAAATC TAAGTAAAAC CCACACAATT 1620
AACTTACATT AATGTGGAGA CATGATACTT TTATTGCTCG TTTTGCAGGT ATGTGTTTAA 1680
GCATCCTCGG CCTCGAAAGC CTGACGCTCC ACGTATTTAC GAGGCTCATG TGGGGATGAG 1740
TGGTGAAAAG CCTGAAGTAA GCACATACAG AGAATTTGCA GACAATGTGT TACCGCGCAT 1800
AAAGGCAAAC AACTACAACA CAGTTCAGCT GATGGCAATC ATGGAACATT CATATTATGC 1860
TTCTTTTGGG TACCATGTGA CGAATTTCTT CGCAGTTAGC AGCAGATCAG AACGCCAGAG 1920
ACCTCAATAT CTTGTTGACA AGGCACATAG TTTACGGTTG CGTGTTCTGA TGGATGTTGT 1980
CCATAGCCAT GCGAGCAGTA ATAAGACAGA TGGTCTTAAT GGCTATGATG TTGGGCAAAA 2040
CACACAGGAG TCCTATTTCC ACACAGGAGA AAGGGGCTAT CATAAACTGT GGGATAGCCG 2100
CCTGTTCAAC TATGCCAATT GGGAGTCTTA CGATTTCTTC TTTCTAATCT GAGATATTGG 2160
ATGGACGAAT TCATGTTTGA TGGCTTCCGA TTTGATGGGG TAACATCCAT GCTATATAAT 2220
CACCATGGTA TCAATATGTC ATTCGCTGGA AGTTACAAGG AATATTTTGG TTTGGATACT 2280
GATGTAGATG CAGTTGTTTA CCTGATGCTT GCGAACCATT TAATGCACAA ACTCTTGCCA 2340
GAAGCAACTG TTGTTGCAGA AGATGTTTCA GGCATGCCAG TGCTTTGTCG GTCAGTTGAT 2400
GAAGGTGGAG TAGGGTTTGA CTATCGCCTG GCTATGGCTA TTCCTGATAG ATGGATCGAC 2460
TACTTGAAGA ACAAAGATGA CCTTGAATGG TCAATGAGTG GAATAGCACA TACTCTGACC 2520
AACAGGAGAT ATACGGAAAA GTGCATTGCA TATGCTGAGA GCCATGATCA GGTATGTTTT 2580
CCCTCCTTTG TCGCTGTGCG TGAGTATGTG TTCTTTTTTT ATGGGGCACT GGTCTAAGAA 2640
CATACAGTTC AAAGGTGAGA CACTTTCTTT GCCTGGTAGA CAAATTTGAG AAATAAACAT 2700
TTCGCTTGAT GACTTTTAGT TGCTTCACAA GTTCGAATTA AGTTAGTTAT ATTCTGATAA 2760
CTAGTGATAG TACCCACTAA CCAGCTATTA CGGACCATGT AAGAATGTCC GAAGACTGCA 2820
GTTATATATC GTTGACTTTG TGTTCATCTA TTGAAACAAC TTAGTAGTTA ACTTTCACGC 2880
AAATTTTCAG TCTATTGTTG GCGACAAGAC TATGGCATTT CTCTTGATGG ACAAGGAAAT 2940
GTATACTGGC ATGTCAGACT TGCAGCCTGC TTCGCCTACA ATTGATCGTG GAATTGCACT 3000
TCAAAAGGTT CGATTCGTTT TAAGTATTCC TGAATTTGAT GTTCTAGTTC CAGACGAGTA 3060
TTGTAATGTT CGTTGTTACT CAGAGTTCTG CTTAGTCCTT GAAGATAATG TATTCCAGTC 3120
CCTTTTGGTA CATTTGGCTT ATTTTGTTAC AAATATTTCA GATGATTCAC TTCATCACCA 3180
TGGCCCTTGG AGGTGATGGC TACTTGAATT TTATGGGTAA TGAGGTAATA TCTGGTTATC 3240
TGTCAAAACT TATTTCTGAT CAATATGTTT CGGGATTCCC TCGAAAAAAA TCCTTTGGGC 3300
AGGGCGAAAA GTTTAAACAT CTGTTTTCTA TGATAGCCAA GTACTCCCCA GCTATTTCCA 3360
TGTTATCACG TATCATTTAG CTGTGCCGGT AGTTAATCTT TATTCTAATT CATTGTTGTT 3420
TTTTAGCGTG GCAGTCTATT GTTGGATCCT CTTATTCCAA TTACATATAT GCCGACATCA 3480
CACACTTATG AATATTCCCT GTTTAAAAGA TTTTTATTTT ATACCAATGT TTCTCCGTAA 3540
ATGATGCAAA CATGATAGAG ATGTTAGCAT GTCTTTCTTA ACCTACTCAT GTTTTACATA 3600
TCACGACAAG CTTCTTGCAG AAAATCAGCA GTATATGGCA AATTGCTGCA ACCTGACAAC 3660
GTTTATATCT GTTTTCTAAC TCATACTGAC GGTGCAATTT CCTTTTAGTT TGGCCACCCA 3720
GAATGGATTG ACTTTCCAGA AGAAGGCAAC AACTGGAGTT ATGATAAATG CAGACGCCAG 3780
TGGAGCCTCG CAGACATTGA TCACCTACGA TACAAGGTTA TGCCTATGTA TATTTTTACA 3840
GTTTCTGGTC TGGTAGCTCT CTTGGGATCT TGACCTCACT TAGTTCCTTC ATCTCTGACT 3900
GTAGCTTATT TACACTGTGT TCCAACTTCT GTCTTGTGGA TAAATTCTCC CTTCTAACGT 3960
TTCATATTAA GCCTTTCAAA CTAAACTAAA TTGCTGATCT ACTACTAGTT GCTCAGTACG 4020
ATGACCAAAT CTTGCCTGTG GTAACCTAGT AATTTTCTTG ATTCTTACAC ATTAGTGATA 4080
TGCAGTGCAT ACATTATCCA TATAAATTGA CATTGCAATT TCCCAAATAT TATTTGAAGG 4140
CTGTGTTCTT TTGTTAACAG GAAGTTATTT TCTCTGCATC TGATAAATAA TAATAGCCTT 4200
TCACGATTTT TCTCATATTT TATCCAACTT TTCTGCATTC AAGCATTTTT TGTTTCTCGC 4260
CTAACATATA TAATTTGAAC AGTACATGAA CGCATTTGAT CAAGCAATGA ATGCGCTCGA 4320
CGACAAATTT TCCTTCCTAT CATCATCAAA GCAGATTGTC AGCGACATGA ATGAGGAAAA 4380
GAAGTAGTTA ACTATACAAT GTTTAGTCAG GGCAGCTGTT GCATCATTTG ATTCACTCCT 4440
ACTCTTAAGA ATAGCAACTC TGACTTGTGC GTTTTATGTT ACCAAATAAG TTGAAACCGT 4500
ATCTGTTTGA TATGAACCAT TGTTGTCTCA AAATGGGCTA TGGACTCAAT CCAACTTCCT 4560
TTCCAGATTA TTGTATTTGA ACGTGGAATC TGGTCTTCGT CTTCAATTTT CATCCCAGTA 4620
AAACTTATGA TGGGTAACTG ATCTCTTGCA AGCTTTGCCT TTCAATATTT CTTCTGCTTA 4680
ATGACTAATG TGCTTAATCT CGTTTCCACT TTTAAAACAC GCAGTTACAA AGTCGGATGT 4740
GACTTGCCTG GGAAGTACAA GGTAGCTCTG GACTCTGATG CTCTGATGTT TGGTGGACAT 4800
GGAAGAGTAA GCAATGTTAA TGATGTTCAA GATCTGTTTT GCAACACTAT GTTCTTCTAT 4860
AGAAGGGGCC ATCAAGGCTG CATCAGATAA TCTTATTTGC AGTGTTGATC TGTGCTGCAT 4920
CGCAGGTGGC CCATGACAAC GATCACTTTA CGTCACCTGA AGGAGTACCA GGAGTACCTG 4980
AAACAAACTT CAACAACCGC CCTAACTCAT TCAAAATCCT GTCTCCATCC CGCACTTGTG 5040
TGGTAATGCT AATTACTAGG AGGATTTAGT AACAATAAAT AAATAACAGC AAAAGATATC 5100
TGCAGTACGA TCTCACAAAA TGCTCTCTTG CCAGGCTTAC TATCGCGTCG AGGAGAAAGC 5160
GGAAAAGCCC AAGGATGAAG GAGCTGCTTT CTTGGGGGAA ACTGCTCTCG GGTACATCGA 5220
TGTTGAAGCC ACTGGCGTCA AAGACGCAGC AGATGGTGAG GCGACTTCTG GTTCCGAAAA 5280
GGCGTCTACA GGAGGTGACT CCAGCAAGAA GGGAATTAAC TTTGTCTTTC TGTCACCCGA 5340
CAAAGACAAC AAATAAGCAC CATATCAACG CTTGATCAGG ACCGTGTGCC GACGTCCTTG 5400
TAATACTCCT GCTATTGCTA GTAGTAGCAA TACTGTCAAA CTGTGCAGAC TTGAAATTCT 5460
GGCTTGGACT TTGCTGAGGT TACCTACTAT ATAGAAAGAT AAATAAGCGG TGATGGTGCG 5520
GGTCGAGTCC AGCTATATGT GCCAAATATG CGCCATCCCG AGTCCTCTGT CATAAAGAAA 5580
GTTTCGGGCT TCCATCCCAG AATAAAAACA GTTGTCTGTT TGCAATTTCT TTTTGTCTTG 5640
CATAGTTACA TGATAATTGA TGCATATTGC TATAAGCCTG GATTGCATCT TCTTTTGCTA 5700
ATAACTGCAG GGCCAAGAAA GCCTAGATTG TATCTTTTTT TGCTAATAAC TGCAGTGCTG 5760
GGGAAGCTTC AGTCCTTGTT TCCGTTCTCG AGACAAGGCG TCATGTTTGG CGCACAAAGG 5820
TAAGCCATCA TCTTATCAAG TCCCAAAATT CTCTGGTTGA AAGAAACCAT CACTAACTTG 5880
TTCCAGGTGT TGGTTCCTCC ACAACCAAAA GGCGACCATC GTCGTCATCA TCGCTCACAG 5940
CACTGACCAT CGAAGCCACG GTGGGCATGA AATGCGCATC GCCCAAGACT TGGGACCGTT 6000
TCAAAATATC ACAAACTGCC ATGGCATCTT CTGCCAAAGG CTGCACTGCA CCTTTGGCAT 6060
GAACAGAAGC AACAGGGGCT TGGAACTGAA CGCCGAAAAT AAAGTCAAAC CGGCTGGGCC 6120

GGATTGAAAG GGGAAACGCC AAAATCCACT TAATTTGAAT GGAAGGAGGA ATGGTTCTTG 6180
CTGGTTTCAA CTCTGCAGGC TTCCCTCTGA ATTTCACACG GAGCCATT 6228

(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11463 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE:
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..11463
(D) OTHER INFORMATION:/product= "complete sequence of the starch branching enzyme II gene"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:





AGAAACACCT CCATTTTAGA TTTTTTTTTT GTTCTTTTCG GACGGTGGGT CGTGGAGAGA 60
TTAGCGTCTA GTTTTCTTAA AAGAACAGGC CATTTAGGCC CTGCTTTACA AAAGGCTCAA 120
CCAGTCCAAA ACGTCTGCTA GGATCACCAG CTGCAAAGTT AAGCGCGAGA CCACCAAAAC 180
AGGCGCATTC GAACTGGACA GACGCTCACG CAGGAGCCCA GCACCACAGG CTTGAGCCTG 240
ACAGCGGACG TGAGTGCGTG ACACATGGGG TCATCTATGG GCGTCGGAGC AAGGAAGAGA 300
GACGCACATG AACACCATGA TGATGCTATC AGGCCTGATG GAGGGAGCAA CCATGCACCT 360
TTTCCCCTCT GGAAATTCAT AGCTCACACT TTTTTTTAAT GGAAGCAAGA GTTGGCAAAC 420
ACATGCATTT TCAAACAAGG AAAATTAATT CTCAAACCAC CATGACATGC AATTCTCAAA 480
CCATGCACCG ACGAGTCCAT GCGAGGTGGA AACGAAGAAC TGAAAATCAA CATCCCAGTT 540
GTCGAGTCGA GAAGAGGATG ACACTGAAAG TATGCGTATT ACGATTTCAT TTACATACAT 600
GTACAAATAC ATAATGTACC CTACAATTTG TTTTTTGGAG CAGAGTGGTG TGGTCTTTTT 660
TTTTTACACG AAAATGCCAT AGCTGGCCCG CATGCGTGCA GATCGGATGA TCGGTCGGAG 720
ACGACGGACA ATCAGACACT CACCAACTGC TTTTGTCTGG GACACAATAA ATGTTTTTGT 780
AAACAAAATA AATACTTATA AACGAGGGTA CTAGAGGCCG CTAACGGCAT GGCCAGGTAA 840
ACGCGCTCCC AGCCGTTGGT TTGCGATCTC GTCCTCCCGC ACGCAGCGTC GCCTCCACCG 900
TCCGTCCGTC GCTGCCACCT CTGCTGTGCG CGCGCACGAA GGGAGGAAGA ACGAACGCCG 960

CACACACACT CACACACGGC ACACTCCCCG TGGGTCCCCT TTCCGGCTTG GCGTCTATCT 1020
CCTCTCCCCC GCCCATCCCC ATGCACTGCA CCGTACCCGC CAGCTTCCAC CCCCGCCGCA 1080
CACGTTGCTC CCCCTTCTCA TCGCTTCTCA ATTAATATCT CCATCACTCG GGTTCCGCGC 1140
TGCATTTCGG CCGGCGGGTT GAGTGAGATC TGGGCGACTG GCTGACTCAA TCACTACGCG 1200
GGGATGGCGA CGTTCGCGGT GTCCGGCGCG ACTCTCGGTG TGGCGCGGGC CGGCGTCGGA 1260
GTGGCGCGGG CCGGCTCGGA GCGGAGGGGC GGGGCGGACT TGCCGTCGCT GCTCCTCAGG 1320
AAGAAGGACT CCTCTCGTAC GCCTCGCTCT CTCGAATCTC CCCCGTCTGG CTTTGGCTCC 1380
CCTTCTCTCT CCTCTGCGCG CGCATGGCCT GTTCGATGCT GTTCCCCAAT TGATCTCCAT 1440
GAGTGAGAGA GATAGCTGGA TTAGGCGATC GCGCTTCCTG AACCTGTATT TTTTCCCCCG 1500
CGGGGAAATG CGTTAGTGTC ACCCAGGCCC TGGTGTTACC ACGGCTTTGA TCATTCCTCG 1560
TTTCATTCTG ATATATATTT TCTCATTCTT TTTCTTCCTG TTCTTGCTGT AACTGCAAGT 1620
TGTGGCGTTT TTTCACTATT GTAGTCATCC TTGCATTTTG CAGGCGCCGT CCTGAGCCGC 1680
GCGGCCTCTC CAGGGAAGGT CCTGGTGCCT GACGGCGAGA GGACGACTTG GCAAGTCCGG 1740
CGCAACCTGA AGAATTACAG GTACACACAC TCGTGCCGGT AAATCTTCAT ACAATCGTTA 1800
TTCACTTACC AAATGCCGGA TGAAACCAAC CACGGATGCG TCAGGTTTCG AGCTTCTTCT 1860
ATCAGCATTG TGCAGTACTG CACTGCCTTG TTCATTTTGT TAGCCTTGGC CCCGTGCTGG 1920
CTCTTGGGCC ACTGAAAAAA TCAGATGGAT GTGCATTCTA GCAAGAACTT CACAACATAA 1980
TGCACCGTTT GGGGTTTCGT CAGTCTGCTC TACAATTGCT ATTTTTCGTG CTGTAGATAC 2040
CTGAAGATAT CGAGGAGCAA ACGGCGGAAG TGAACATGAC AGGGGGGACT GCAGAGAAAC 2100
TTCAATCTTC AGAACCGACT CAGGGCATTG TGGAAACAAT CACTGATGGT GTAACCAAAG 2160
GAGTTAAGGA ACTAGTCGTG GGGGAGAAAC CGCGAGTTGT CCCAAAACCA GGAGATGGGC 2220
AGAAAATATA CGAGATTGAC CCAACACTGA AAGATTTTCG GAGCCATCTT GACTACCGGT 2280
AATGCCTACC CGCTGCTTTC GCTCATTTTG AATTAAGGTC CTTTCATCAT GCAAATTTGG 2340
GGAACATCAA AGAGACAAAG ACTAGGGACC ACCATTTCAT ACAGATCCCT TCGTGGTCTG 2400
AGAATATGCT GGGAAGTAAA TGTATAATTG ATGGCTACAA TTTGCTCAAA ATTGCAATAC 2460
GAATAACTGT CTCCGATCAT TACAATTAAA GAGTGGCAAA CTGATGAAAA TGTGGTGGAT 2520
GGGTTATAGA TTTTACTTTG CTAATTCCTC TACCAAATTC CTAGGGGGGA AATCTACCAG 2580
TTGGGAAACT TAGTTTCTTA TCTTTGTGGC CTTTTTGTTT TGGGGAAAAC ACATTGCTAA 2640
ATTCGAATGA TTTTGGGTAT ACCTCGGTGG ATTCAACAGA TACAGCGAAT ACAAGAGAAT 2700
TCGTGCTGCT ATTGACCAAC ATGAAGGTGG ATTGGAAGCA TTTTCTCGTG GTTATGAAAA 2760
GCTTGGATTT ACCCGCAGGT AAATTTAAAG CTTTATTATT ATGAAACGCC TCCACTAGTC 2820
TAATTGCATA TCTTATAAGA AAATTTATAA TTCCTGTTTT CCCCTCTCTT TTTTCCAGTG 2880
CTGAAGGTAT CGTCTAATTG CATATCTTAT AAGAAAATTT ATATTCCTGT TTTCCCCTAT 2940
TTTCCAGTGC TGAAGGTATC ACTTACCGAG AATGGGCTCC CTGGAGCGCA TGTTATGTTC 3000



TTTTAAGTTC CTTAACGAGA CACCTTCCAA TTTATTGTTA ATGGTCACTA TTCACCAACT 3060
AGCTTACTGG ACTTACAAAT TAGCTTACTG AATACTGACC AGTTACTATA AATTTATGAT 3120
CTGGCTTTTG CACCCTGTTA CAGTCTGCAG CATTAGTAGG TGACTTCAAC AATTGGAATC 3180
CAAATGCAGA TACTATGACC AGAGTATGTC TACAGCTTGG CAATTTTCCA CCTTTGCTTC 3240
ATAACTACTG ATACATCTAT TTGTATTTAT TTAGCTGTTT GCACATTCCT TAAAGTTGAG 3300
CCTCAACTAC ATCATATCAA AATGGTATAA TTTGTCAGTG TCTTAAGCTT CAGCCCAAAG 3360
ATTCTACTGA ATTTAGTCCA TCTTTTTGAG ATTGAAAATG AGTATATTAA GGATGAATGA 3420
ATACGTGCAA CACTCCCATC TGCATTATGT GTGCTTTTCC ATCTACAATG AGCATATTTC 3480
CATGCTATCA GTGAAGGTTT GCTCCTATTG ATGCAGATAT TTGATATGGT CTTTTCAGGA 3540
TGATTATGGT GTTTGGGAGA TTTTCCTCCC TAACAACGCT GATGGATCCT CAGCTATTCC 3600
TCATGGCTCA CGTGTAAAGG TAAGCTGGCC AATTATTTAG TCGAGGATGT AGCATTTTCG 3660
AACTCTGCCT ACTAAGGGTC CCTTTTCCTC TCTGTTTTTT AGATACGGAT GGATACTCCA 3720
TCCGGTGTGA AGGATTCAAT TTCTGCTTGG ATCAAGTTCT CTGTGCAGGC TCCAGGTGAA 3780
ATACCTTTCA ATGGCATATA TTATGATCCA CCTGAAGAGG TAAGTATCGA TCTACATTAC 3840
ATTATTAAAT GAAATTTCCA GTGTTACAGT TTTTTAATAC CCACTTCTTA CTGACATGTG 3900
AGTCAAGACA ATACTTTTGA ATTTGGAAGT GACATATGCA TTAATTCACC TTCTAAGGGC 3960
TAAGGGGCAA CCAACCTTGG TGATGTGTGT ATGCTTGTGT GTGACATAAG ATCTTATAGC 4020
TCTTTTATGT GTTCTCTGTT GGTTAGGATA TTCCATTTTG GCCTTTTGTG ACCATTTACT 4080
AAGGATATTT ACATGCAAAT GCAGGAGAAG TATGTCTTCC AACATCTCAA CTAAACGACC 4140
AGAGTCACTA AGGATTTATG AATCACACAT TGGAATGAGC AGCCCGGTAT GTCAATAAGT 4200
TATTTCACCT GTTTCTGGTC TGATGGTTTA TTCTATGGAT TTTCTAGTTC TGTTATGTAC 4260
TGTTAACATA TTACATGGTG CATTCACTTG ACAACCTCGA TTTTATTTTC TAATGTCTTC 4320
ATATTGGCAA GTGCAAAACT TTGCTTCCTC TTTGTCTGCT TGTTCTTTTG TCTTCTGTAA 4380
GATTTCCATT GCATTTGGAG GCAGTGGGCA TGTGAAAGTC ATATCTATTT TTTTTTTGTC 4440
AGAGCATAGT TATATGAATT CCATTGTTGT TGCAATAGCT CGGTATAATG TAACCATGTT 4500
ACTAGCTTAA GATTTCCCAC TTAGGATGTA AGAAATATTG CATTGGAGCG TCTCCAGCAA 4560
GCCATTTCCT ACCTTATTAA TGAGAGAGAG ACAAGGGGGG GGGGGGGGGG GGGGTTCCCT 4620
TCATTATTCT GCGAGCGATT CAAAAACTTC CATTGTTCTG AGGTGTACGT ACTGCAGGGA 4680
TCTCCCATTA TGAAGAGGAT ATAGTTAATT CTTTGTAACC TACTTGGAAA CTTGAGTCTT 4740
GAGGCATCGC TAATATATAC TATCATCACA ATACTTAGAG GATGCATCTG AAATTTTAGT 4800
GTGATCTTGC ACAGGAACCG AAGATAAATT CATATGCTAA TTTTAGGGAT GAGGTGTTGC 4860
CAAGAATTAA AAGGCTTGGA TACAATGCAG TGCAGATAAT GGCAATCCAG GAGCATTCAT 4920
ACTATGCAAG CTTTGGGTAT TCACACAATC CATTTTTTTC TGTATACACT CTTCACCCAT 4980

TTGGAGCTAT TACATCCTAA TGCTTCATGC ACATAAAATA TTTGGATATA ATCCTTTATT 504 0 

5 AGATATATAG TACAACTACA CTTAGTATTC TGAAAAAGAT CATTTTATTG TTGTTGGCTT 5100 

GTTCCAGGTA CCATGTTACT AATTTTTTTG CACCAAGTAG CCGTTTTGGA ACTCCAGAGG 5160 

ACTTAAAATC CTTGATCGAT AGAGCACATG AGCTTGGTTT GCTTGTTCTT ATGGATATTG 522 0 

10 

TTCATAGGTA ATTAGTCCAA TTTAATTTTA GCTGTTTTAC TGTTTATCTG GTATTCTAAA 52 8 0 

GGGAAATTCA GGCAATTATG ATACATTGTC AAAAGCTAAG AGTGGCGAAA GTGAAATGTC 53 40 

15 AAAATCTAGA GTGGCATAAG GAAAATTGGC AAAAACTAGA GTGGCAAAAA TAAAATTTTC 54 00 

CCATCCTAAA TGGCAGGGCC CTATCGCCGA AT ATTTTTC C ATTCTATATA ATTGTGCTAC 54 60 

GTGACTTCTT TTTTCTCAGA TGTATTAAAC CAGTTGGACA TGAAATGTAT TTGGTACATG 552 0 

20 

TAGTAAACTG ACAGTTCCAT AGAATATCGT TTTGTAATGG CAACACAATT TGATGCCATA 55 8 0 

GATGTGGATT GAGAAGTTCA GATGCTATCA ATAGAATTAA TCAACTGGCC ATGTACTCGT 5640 

2 5 GGCACTACAT ATAGTTTGCA AGTTGGAAAA CTGACAGCAA TACCTCACTG ATAAGTGGCC 57 0 0 

AGGCCCCACT TGCCAGCTTC ATACTAGATG TTACTTCCCT GTTGAATTCA TTTGAACATA 57 60 

TTACTTAAAG TTCTTCATTT GTCCTAAGTC AAACTTCTTT AAGTTTGACC AAGTCTATTG 5820 

30 

GAAAATATAT CAACATCTAC AACACCAAAT TACTTTGATC AGATTAACAA TTTTTATTTT 58 80 

ATTATATTAG CACATCTTTG ATGTTGTAGA TATCAGCACA TTTTTCTATA GACTTGGTCA 59 4 0 

3 5 AATATAGAGA AGTTTGACTT AG G AC AAATC TAGAACTTCA ATCAATTTGG ATCAGAGGGA 600 0 

ACATCAAATA ATATAGATAG ATGTCAACAC TTCAACAAAA AAATC AG AC C TTGTCACCAT 606 0 

ATATGCATCA GACCATCTGT TTGCTTTAGC CACTTGCTTT CATATTTATG TGTTTGTACC 612 0 

40 

TAATCTACTT TTCCTTCTAC TTGGTTTGGT TGATTCTATT TCAGTTGCAT TGCTTCATCA 6180 

ATGATTTTGT GTACCCTGCA GTCATTCGTC AAATAATACC CTTGACGGTT TGAATGGTTT 62 40 

4 5 CGATGGCACT GATACACATT AC TTCC ACGG TGGTCCACGC GGCCATCATT GGATGTGGGA 63 00 

TTCTCGTCTA TTCAACTATG GGAGTTGGGA AGTATGTAGC TCTGACTTCT GTCACCATAT 63 6 0 

TTGGCTAACT GTTCCTGTTA ATCTGTTCTT ACACATGTTG ATATTCTATT CTTATGCAGG 642 0 

50 

TATTGAGATT CTTACTGTCA AACGCGAGAT GGTGGCTTGA AGAATATAAG TTTGATGGAT 64 8 0 

TTCGATTTGA TGGGGTGACC TCCATGATGT ATACTCACCA TGGATTACAA GTAAGTCATC 6 54 0 

55 AAGTGGTTTC AGTAACTTTT TTAGGGCACT GAAACAATTG CTATGCATCA TAACATGTAT 66 00 

CATGATCAGG ACTTGTGCTA CGGAGTCTTA GATAGTTCCC TAGTATGCTT GTACAATTTT 6 66 0 

ACCTGATGAG ATCATGGAAG ATTGGAAGTG ATTATTATTT ATTTTCTTTC TAAGTTTGTT 67 2 0 

60 

TCTTGTTCTA GATGACATTT ACTGGGAACT ATGGCGAATA TTTTGGATTT GCTACTGATG 67 8 0 

TTGATGCGGT AGTTTACTTG ATGCTGGTCA ACGATCTAAT TCATGGACTT TATCCTGATG 6 84 0 

65 CTGTATCCAT TGGTGAAGAT GTAAGTGCTT ACAGTATTTA TGATTTTTAA CTAGTTAAGT 69 00 

AGTTTTATTT TGGGGATCAG TCTGTTACAC TTTTTGTTAG GGGTAAAATC TCTCTTTTCA 6960

TAACAATGCT AATTTATACC TTGTATGATA ATGCATCACT TAGTAATTTG AAAAGTGCAA 7020

GGGCATTCAA GCTTACGAGC ATATTTTTTG ATGGCTGTAA TTTATTTGAT AGTATGCTTG 7 080 

5 

TTTGGGTTTT TCAATAAGTG GGAGTGTGTG ACTAATGTTG TATTATTTAT TTAATTGCGG 714 0 

AAGAAATGGG CAACCTTGTC AATTGCTTCA GAAGGCTAAC TTTGATTCCA TAAACGCTTT 7 2 00 

10 GGAAATGAGA GGCTATTCCC AAGGACATGA ATT AT AC TTC AGTGTGTTCT GTACATGTAT 7 260 

TTGTAATAGT GGTTTAACTT AAATTCCTGC ACTGCTATGG AATCTCACTG TATGTTGTAG 7 3 20 

TGTACACATC CACAAACAAG TAATC CTGAG CTTTCAACTC ATGAGAAAAT AGAGTCCGCT 73 8 0 

15 

TCTGCCAGCA TTAACTGTTC ACAGTTCTAA TTTGTGTAAC TGTGAAATTG TTCAGGTCAG 744 0 

TGGAATGCCT ACATTTTGCA TCCCTGTTCC AGATGGTGGT GTTGGTTTTG ACTACCGCCT 7 500 

2 0 GCATATGGCT GTAGCAGATA AATGGATTGA ACTCCTCAAG TAAGTGCAGG AATATTGGTG 7 5 60 

ATTACATGCG CACAATGATC TAGATTACAT TTTCTAAATG GTAAAAAGGA AAATATG TAT 762 0 

GTGAATATCT AGACATTTGC CTGTTATCAG CTTGAATACG AGAAGTCAAA TACATGATTT 7 6 80 

25 

AAATAGCAAA TCTCGGAAAT GTAATGGCTA GTGTCTTTAT GCTGGGCAGT GTACATTGCG 77 4 0 

CTGTAGCAGG CCAGTCAACA CAGTTAGCAA TATTTTCAGA AACAATATTA TTTATATCCG 7 8 00 

3 0 TATATGAGAA AGTTAGTATA T AAAC TGTGG TCATTAATTG TGTTCACCTT TTGTCCTGTT 7 8 60 

TAAGGATGGG CAGTAGGTAA TAAATTTAGC CAGATAAAAT AAATCGTTAT TAGGTTTACA 7 92 0 

AAAGGAATAT ACAGGGTCAT GTAGCATATC TAGTTGTAAT TAATGAAAAG GCTGACAAAA 7 9 80 

35 

GGCTCGGTAA AAAAAACTTT ATGATGATCC AGATAGATAT GCAGGAACGC GACTAAAGCT 8040 

CAAATACTTA TTGCTACTAC ACAGCTGCCA ATCTGTCATG ATCTGTGTTC TGCTTTGTGC 8100 

40 TATTTAGATT TAAATACTAA CTCGATACAT TGGCAATAAT AAACTTAACT ATTCAACCAA 8160 

TTTGGTGGAT ACCAGAATTT CTGCCCTCTT GTTAGTAATG ATGTGCTCCC TGCTGCTGTT 822 0 

CTCTGCCGTT ACAAAAGCTG TTTTCAGTTT TTTGCATCAT TATTTTTGTG TGTGAGTAGT 82 80 

45 

TTAAGCATGT TTTTTGAAGC TGTGAGCTGT TGGTACTTAA TACATTCTTG GAAGTGTCCA 83 40 

AATATGCTGC AGTGTAATTT AGCATTTCTT TAACACAGGC AAAGTGACGA ATCTTGGAAA 8400 

50 ATGGGCGATA TTGTGCACAC CCTAACAAAT AGAAGGTGGC TTGAGAAGTG TGTAAC TT AT 8 4 60 

GCAGAAAGTC ATGATCAAGC ACTAGTTGGT G AC AAG AC T A TTGCATTCTG GTTGATGGAT 852 0 

AAGGTACTAG CTGTTACTTT TGGACAAAAG AATTACTCCC TCCCGTTCCT AAATATAAGT 8 580 

55 

CTTTGTAGAG ATTCCACTAT GGACCACATA G TAT AT AG AT GCATTTTAGA GTGTAGATTC 8640 

ACTCATTTTG CTTCGTATGT AGTCCATAGT GAAATCTCTA CAGAGACTTA TATTTAGGAA 87 00 

60 CGGAGGGAGT ACATAATTGA TTTGTCTCAT C AG ATTGC T A GTGTTTTCTT GTGATAAAGA 8760 

TTGGCTGCCT CACCCATCAC CAGCTATTTC CCAACTGTTA CTTGAGCAGA ATTTGC TGAA 8 82 0 

AACGTACCAT GTGGTACTGT GGCGGCTTGT GAACTTTGAC AGTTATGTTG CAATTTTCTG 8 8 80 

TTCTTATTTA TTTGATTGCT TATGTTACCG TTCATTTGCT CATTCCTTTC CGAGACCAGC 894 0 



CAAAGTCACG TGTTAGCTGT GTGATCTGTT ATCTGAATCT TGAGCAAATT TTATTAATAG 9000

GCTAAAATCC AACGAATTAT TTGCTTGAAT TTAAATATAC AGACGTATAG TCACCTGGCT 9060 

5 CTTTCTTAGA TG ATT AC C AT AGTGCCTGAA GGCTGAAATA GTTTTGGTGT TTCTTGGATG 912 0 

CCGCCTAAAG GAGTGATTTT TATTGGATAG ATTCCTGGCC GAGTCTTCGT TACAACATAA 9180 

CATTTTGGAG ATATGCTTAG TAACAGCTCT GGGAAGTTTG GTCACAAGTC TGCATCTACA 9 240 

10 

CGCTCCTTGA GGTTTTATTA TGGCGCCATC TTTGTAACTA GTGGCACCTG TAAGGAAACA 93 00 

CATTCAAAAG GAAACGGTCA CATCATTCTA ATCAGGACCA C CAT AC T AAG AGCAAGATTC 93 6 0 

15 TGTTCCAATT TTATGAGTTT TTGGGACTCC AAAGGGAACA AAAGTGTCTC ATATTGTGCT 9420 

TATAACTACA GTTGTTTTTA TACCAGTGTA GTTTTATTCC AGGACAGTTG ATACTTGGTA 9480 

CTGTGCTGTA AATTATTTAT CCGACATAGA ACAGCATGAA CATATCAAGC TCTCTTTGTG 9540 

20 

CAGGATATGT ATGATTTCAT GGCTCTGGAT AGGCTTCAAC TCTTCGCATT GATCGTGGCA 9 600 

TAGCATTACA TAAAATGATC AGGCTTGTCA CCATGGGTTT AGGTGGTGAA GGCTATCTTA 9 660 

2 5 ACTTCATGGG AAATGAGTTT GGGCATCCTG GTCAGTCTTT ACAACATTAT TGCATTCTGC 972 0 

ATGATTGTGA TTTACTGTAA TTTGAACCAT GCTTTTCTTT CACATTGTAT GTATTATGTA 97 80 

ATCTGTTGCT TCCAAGGAGG AAGTTAACTT CTATTTACTT GGCAGAATGG ATAGATTTTC 9840 

30 

CAAGAGGCCC ACAAACTCTT CCAACCGGCA AAGTTCTCCC CTGGAAATAA CAATAGTTAT 9 900 

GATAAATGCC GCCGTAGATT TGATCTTGTA AGTTTTAGCT GTGCTATTAC ATTCCCTCAC 996 0 

3 5 TAGATCTTTA TTGGCCATTT ATTTCTTGAT GAAATCATAA TGTTTGTTAG GAAAGATCAA 10020 

CATTGCTTTT GTAGTTTTGT AGACGTTAAC ATAAGTATGT GTTGAGAGTT GTTGATCATT 10080 

AAAAATATCA TGATTTTTTG CAGGGAGATG CAGATTTTCT TAGATATCGT GGTATGCAAG 10140 

40 

AGTTCGATCA GGCAATGCAG CATCTTGAGG AAAAATATGG GGTATGTCAC TGGTTTGTCT 102 00 

TTGTTGCATA ACAAGTCACA GTTTAACGTC AG TCTCTTC A AGTGGTAAAA AAAGTGTAGA 102 60 

4 5 ATTAATTCCT GTAATGAGAT GAAAACTGTG CAAAGGCGGA GCTGGAATTG CTTTTCACCA 10320 

AAAC TATTTT CTTAAGTGCT TGTGTATTGA TACATATACC AGCACTGACA ATGTAACTGC 10380 

AGTTTATGAC ATCTGAGCAC CAGTATGTTT CACGGAAACA TGAGGAAGAT AAGGTGATCA 10440 

50 

TCCTCAAAAG AGGAGATTTG GTATTTGTTT TCAACTTCCA CTGGAGCAAT AGCTTTTTTG 10500 

ACTACCGTGT TGGGTGTTCC AAGCCTGGGA AGTACAAGGT ATGCTTGCCT TTTCATTGTC 10560 

55 CACCCTTCAC CAGTAGGGTT AGTGGGGGCT TCTACAACTT TTAATTCCAC ATGGATAGAG 10620 

TTTG TTGGTC GTGCAGCTAT CAATATAAAG AATAGGGTAA TTTGTAAAGA AAAGAATTTG 10680 

CTCGAGCTGT TGTAGCCATA GGAAGGTTGT TCTTAACAGC CCCGAAGCAC ATACCATTCA 10740 

60 

TTCATATTAT CT AC TT AAG T GTTTGTTTCA ATCTTTATGC TCAGTTGGAC TCGGTCTAAT 10800 

ACTAGAACTA TTTTCCGAAT CTACCCTAAC CATCCTAGCA GTTTTAGAGC AGCCCCATTT 10860 

65 GGACAATTGG CTGGGTTTTT GTTAGTTGTG ACAGTTTCTG CTATTTCTTA ATCAGGTGGC 10920 

CTTGGACTCT GACGATGCAC TCTTTGGTGG ATTCAGCAGG CTTGATCATG ATGTCGACTA 10980

CTTCACAACC GTAAGTCTGG GCTCAAGCGT CACTTGACTC GTCTTGACTC AACTGCTTAC 11040

AAATCTGAAT CAACTTCCCA ATTGCTGATG CCCTTGCAGG AACATCCGCA TGACAACAGG 11100

CCGCGCTCTT TCTCGGTGTA CACTCCGAGC AGAACTGCGG TCGTGTATGC CCTTACAGAG 11160

TAAGAACCAG CAGCGGCTTG TTACAAGGCA AAGAGAGAAC TCCAGAGAGC TCGTGGATCG 11220

TGAGCGAAGC GACGGGCAAC GGCGCGAGGC TGCTCCAAGC GCCATGACTG GGAGGGGATC 11280

GTGCCTCTTC CCCAGATGCC AGGAGGAGCA GATGGATAGG TAGCTTGTTG GTGAGCGCTC 11340

GAAAGAAAAT GGACGGGCCT GGGTGTTTGT TGTGCTGCAC TGAACCCTCC TCCTATCTTG 11400

CACATTCCCG GTTGTTTTTG TACATATAAC TAATAATTGC CCGTGCGCTC AACGTGAAAA 11460

TCC 11463



(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2662 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE:

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1..2651
(D) OTHER INFORMATION:/product= "nucleotide sequence of cDNA wheat SSS I"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:



TCTCCCACTC TTCTCTCCCC GCGCACACCG AGTCGGCACC GGCTCATCAC CCATCACCTC 60

GGCCTCGGCC ACCGGCAAAC CCCCCGATCC GCTTTTGCAG GCAGCGCACT AAAACCCCGG 120

GGAGCGCGCC CCGCGGCAGC AGCAGCACCG CAGTGGGAGA GAGAGGCTTC GCCCCGGCCC 180

GCACCGAGCG GGGCGATCCA CCGTCCGTGC GTCCGCACCT CCTCCGCCTC CTCCCCTGTC 240

CCGCGCGCCC ACACCCATGG CGGCGACGGG CGTCGGCGCC GGGTGCCTCG CCCCCAGCGT 300

CCGCCTGCGC GCCGATCCGG CGACGGCGGC CCGGGCGTCC GCCTGCGTCG TCCGCGCGCG 360

GCTCCGGCGC TTGGCGCGGG GCCGCTACGT TGCCGAGCTC AGCAGGGAGG GCCCCGCGGC 420

GCGCCCCGCG CAGCAGCAGC AACTGGCCCC GCCGCTCGTG CCAGGCTTCC TCGCGCCGCC 480

GCCGCCCGCG CCCGCCCAGT CGCCGGCCCC GACGCAGCCG CCCCTGCCGG ACGCCGGCGT 540

GGGGGAACTC GCGCCCGACC TCCTGCTCGA AGGGATTGCT GAGGATTCCA TCGACAGCAT 600

AATTGTGGCT GCAAGTGAGC AGGATTCTGA GATCATGGAT GCGAATGAGC AACCTCAAGC 660

TAAAGTTACA CGTAGCATCG TGTTTGTGAC TGGTGAAGCT GCTCCTTATG CAAAGTCAGG 72 0 

5 

GGGGCTGGGA GATGTTTGTG GTTCGTTACC AATTGCTCTT GCTGCTCGTG GTCACCGTGT 78 0 

GATGGTTGTA ATGCCAAGAT ACTTGAATGG GTCCTCTGAT AAAAACTATG CAAAGGCATT 84 0 

10 ATACACTGGG AAGCACATTA AGATTCCATG CTTTGGGGGA TCACATGAAG TGACCTTTTT 900 

TCATGAGTAT AG AG AC AAC G TCGATTGGGT GTTTGTCGAT CATCCGTCAT ATCATAGACC 9 60 

AGGAAGTTTA TATGGAGATA ATTTTGGTGC TTTTGGTGAT AATCAGTTCA GATACACACT 102 0 

15 

CCTTTGCTAT GCTGCATGCG AGGCCCCACT AATCCTTGAA TTGGGAGGAT ATATTTATGG 1080 

ACAGAATTGC ATGTTTGTTG TGAACGATTG GCATGCCAGC CTTGTGCCAG TCCTTCTTGC 1140 

2 0 TGCAAAATAT AGACCATACG GTGTTTACAG AGATTCCCGC AGCACCCTTG TTATACATAA 12 00 

TTTAGCACAT CAGGGTCTGG AGCCTGCAAG TACATATCCT GATCTGGGAT TGCCACCTGA 12 60 

ATGGTATGGA GCTTTAGAAT GGGTATTTCC AGAATGGGCA AGGAGGCATG CCCTTGACAA 13 2 0 

25 

GGGTGAGGCA GTTAACTTTT TGAAAGGAGC AGTCGTGACA GCAGATCGAA TTGTGACCGT 13 80 

CAGTCAGGGT TATTCATGGG AGGTCACAAC TGCTGAAGGT GGACAGGGCC TCAATGAGCT 1440 

3 0 CTTAAGCTCC CGAAAAAGTG TATTGAATGG AATTGTAAAT GGAATTGACA TTAATGATTG 150 0 

GAACCCCACC ACAGACAAGT GTCTCCCTCA TCATTATTCT GTCGATGACC TCTCTGGAAA 156 0 

GGCCAAATGT AAAGCTGAAT TGCAGAAGGA GCTGGGTTTA CCTGTAAGGG AGGATGTTCC 162 0 

35 

TCTGATTGGC TTTATTGGAA GACTGGATTA CCAGAAAGGC ATTGATCTCA TTAAAATGGC 1680 

CATTCCAGAG CTCATGAGGG AGGACGTGCA GTTTGTCATG CTTGGATCTG GGGATCCAAT 174 0 

40 TTTTGAAGGC TGGATGAGAT CTACCGAGTC GAGTTACAAG GATAAATTCC GTGGATGGGT 1800 

TGGATTTAGT GTTCCAGTTT CCCACAGAAT AACTGCAGGT TGCGATATAT TGTTAATGCC 18 60 

ATCCAGGTTT GAACCTTGTG GTCTTAATCA GCTATATGCT ATGCAATATG GTACAGTTCC 1920 

45 

TGTAGTTCAT GGAACTGGGG GCCTCCGAGA CACAGTCGAG ACCTTCAACC CTTTTGGTGC 19 80 

AAAAGGAGAG GAGGGTACAG GGTGGGCGTT CTCACCGCTA ACCGTGGACA AGATGTTGTG 2 04 0 

50 GGCATTGCGA ACCGCGATGT CGACATTCAG GGAGCACAAG CCGTCCTGGG AGGGGCTCAT 2100 

GAAGCGAGGC ATGACGAAAG ACCATACGTG GGACCATGCC GCCGAGCAGT ACGAGCAGAT 216 0 

CTTCGAATGG GCCTTCGTGG ACCAACCCTA CGTCATGTAG ACGGGGACTG GGGAGGTCGA 2 22 0 

55 

AGCGCGGGTC TCCTTGAGCT CTGAAGACAT GTTCCTCATC CTTCCGCGGC CCGGAAGGAT 22 8 0 

ACCCCTGTAC ATTGCGTTGT CCTGCTACAG TAGAGTCGCA ATGCGCCTGC TTGCTTGGTC 23 40 

60 CGCCGGTTCG AGAGTAGATG ACGGCTGTGC TGCTGCGGCG GTGACAGCTT CGGGTGGATG 2 4 00 
ACAGTTACAG TTTTGGGGAA TAAGGAAGGG ATGTGCTGCA GGATGGTTAA CAGCAAAGCA 2460 

CCACTCAGAT GGCAGCCTCT CTGTCCGTGT TACAGCTGAA ATCAGAAACC AACTGGTGAC 2 52 0 

TCTTTAGCCT TAGCGATTGT GAAGTTTGTT GCATTCTGTG TATGTTGTCT TGTCCTTAGC 2580

TGACAAATAT TAGACCTGTT GGAGAATTTT ATTTATCTTT GCTGCTGTTG TTTTTGTTTT 2640
GTTAAAAAAA AAAAAAAAAA AA 2662 

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 768 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii

(ix) FEATURE:
(A) NAME/KEY: Protein
(B) LOCATION: 1..768

(ix) FEATURE:
(A) NAME/KEY: Protein
(B) LOCATION: 1..768
(D) OTHER INFORMATION:/product= "deduced amino acid sequence SBE II"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Ala Thr Phe Ala Val Ser Gly Ala Thr Leu Gly Val Ala Arg Pro
1 5 10 15

Pro Ala Ala Ala Gln Pro Glu Glu Leu Gln Ile Pro Glu Asp Ile Glu
20 25 30

Glu Gln Thr Ala Glu Val Asn Met Thr Gly Gly Thr Ala Glu Lys Leu
35 40 45

Glu Ser Ser Glu Pro Thr Gln Gly Ile Val Glu Thr Ile Thr Asp Gly
50 55 60

Val Thr Lys Gly Val Lys Glu Leu Val Val Gly Glu Lys Pro Arg Val
65 70 75 80

Val Pro Lys Pro Gly Asp Gly Gln Lys Ile Tyr Glu Ile Asp Pro Thr
85 90 95

Leu Lys Asp Phe Arg Ser His Leu Asp Tyr Arg Tyr Ser Glu Tyr Arg
100 105 110

Arg Ile Arg Ala Ala Ile Asp Gln His Glu Gly Gly Leu Glu Ala Phe
115 120 125

Ser Arg Gly Tyr Glu Lys Leu Gly Phe Thr Arg Ser Ala Glu Gly Ile
130 135 140

Thr Tyr Arg Glu Trp Ala Pro Gly Ala His Ser Ala Ala Leu Val Gly
145 150 155 160

Asp Phe Asn Asn Trp Asn Pro Asn Ala Asp Thr Met Thr Arg Asp Asp
165 170 175

Tyr Gly Val Trp Glu Ile Phe Leu Pro Asn Asn Ala Asp Gly Ser Pro
180 185 190

Ala Ile Pro His Gly Ser Arg Val Lys Ile Arg Met Asp Thr Pro Ser
195 200 205

Gly Val Lys Asp Ser Ile Ser Ala Trp Ile Lys Phe Ser Val Gln Ala
210 215 220

Pro Gly Glu Ile Pro Phe Asn Gly Ile Tyr Tyr Asp Pro Pro Glu Glu
225 230 235 240

Glu Lys Tyr Val Phe Gln His Pro Gln Pro Lys Arg Pro Glu Ser Leu
245 250 255

Arg Ile Tyr Glu Ser His Ile Gly Met Ser Ser Pro Glu Pro Lys Ile
260 265 270

Asn Ser Tyr Ala Asn Phe Arg Asp Glu Val Leu Pro Arg Ile Lys Arg
275 280 285

Leu Gly Tyr Asn Ala Val Gln Ile Met Ala Ile Gln Glu His Ser Tyr
290 295 300

Tyr Ala Ser Phe Gly Tyr His Val Thr Asn Phe Phe Ala Pro Ser Ser
305 310 315 320

Arg Phe Gly Thr Pro Glu Asp Leu Lys Ser Leu Ile Asp Arg Ala His
325 330 335

Glu Leu Gly Leu Leu Val Leu Met Asp Ile Val His Ser His Ser Ser
340 345 350

Asn Asn Thr Leu Asp Gly Leu Asn Gly Phe Asp Gly Thr Asp Thr His
355 360 365

Tyr Phe His Gly Gly Pro Arg Gly His His Trp Met Trp Asp Ser Arg
370 375 380

Leu Phe Asn Tyr Gly Ser Trp Glu Val Leu Arg Phe Leu Leu Ser Asn
385 390 395 400

Ala Arg Trp Trp Leu Glu Glu Tyr Lys Phe Asp Gly Phe Arg Phe Asp
405 410 415

Gly Val Thr Ser Met Met Tyr Thr His His Gly Leu Gln Met Thr Phe
420 425 430

Thr Gly Asn Tyr Gly Glu Tyr Phe Gly Phe Ala Thr Asp Val Asp Ala
435 440 445

Val Val Tyr Leu Met Leu Val Asn Asp Leu Ile His Gly Leu His Pro
450 455 460

Asp Ala Val Ser Ile Gly Glu Asp Val Ser Gly Met Pro Thr Phe Cys
465 470 475 480

Ile Pro Val Pro Asp Gly Gly Val Gly Phe Asp Tyr Arg Leu His Met
485 490 495

Ala Val Ala Asp Lys Trp Ile Glu Leu Leu Lys Gln Ser Asp Glu Ser
500 505 510

Trp Lys Met Gly Asp Ile Val His Thr Leu Thr Asn Arg Arg Trp Leu
515 520 525

Glu Lys Cys Val Thr Tyr Ala Glu Ser His Asp Gln Ala Leu Val Gly
530 535 540

Asp Lys Thr Ile Ala Phe Trp Leu Met Asp Lys Asp Met Tyr Asp Phe
545 550 555 560

Met Ala Leu Asp Arg Pro Ser Thr Pro Arg Ile Asp Arg Gly Ile Ala
565 570 575

Leu His Lys Met Ile Arg Leu Val Thr Met Gly Leu Gly Gly Glu Gly
580 585 590

Tyr Leu Asn Phe Met Gly Asn Glu Phe Gly His Pro Glu Trp Ile Asp
595 600 605

Phe Pro Arg Gly Pro Gln Thr Leu Pro Thr Gly Lys Val Leu Pro Gly
610 615 620

Asn Asn Asn Ser Tyr Asp Lys Cys Arg Arg Arg Phe Asp Leu Gly Asp
625 630 635 640

Ala Asp Phe Leu Arg Tyr His Gly Met Gln Glu Phe Asp Gln Ala Met
645 650 655

Gln His Leu Glu Glu Lys Tyr Gly Phe Met Thr Ser Glu His Gln Tyr
660 665 670

Val Ser Arg Lys His Glu Glu Asp Lys Val Ile Ile Phe Glu Arg Gly
675 680 685

Asp Leu Val Phe Val Phe Asn Phe His Trp Ser Asn Ser Phe Phe Asp
690 695 700

Tyr Arg Val Gly Cys Ser Arg Pro Gly Lys Tyr Lys Val Ala Leu Asp
705 710 715 720

Ser Asp Asp Ala Leu Phe Gly Gly Phe Ser Arg Leu Asp His Asp Val
725 730 735

Asp Tyr Phe Thr Thr Glu His Pro His Asp Asn Arg Pro Arg Ser Phe
740 745 750

Ser Val Tyr Thr Pro Ser Arg Thr Ala Val Val Tyr Ala Leu Thr Glu
755 760 765

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 10550 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1..316
(D) OTHER INFORMATION:/product= "exon 1"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 1472..1828
(D) OTHER INFORMATION:/product= "exon 2"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 2766..2823
(D) OTHER INFORMATION:/product= "exon 3"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 2906..3028
(D) OTHER INFORMATION:/product= "exon 4"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4113..4194
(D) OTHER INFORMATION:/product= "exon 5"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4286..4459
(D) OTHER INFORMATION:/product= "exon 6"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4562..4643
(D) OTHER INFORMATION:/product= "exon 7"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4744..4855
(D) OTHER INFORMATION:/product= "exon 8"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 4999..5021
(D) OTHER INFORMATION:/product= "exon 9"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 5102..5192
(D) OTHER INFORMATION:/product= "exon 10"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 8593..8718
(D) OTHER INFORMATION:/product= "exon 11"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 8807..8915
(D) OTHER INFORMATION:/product= "exon 12"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 8992..9104
(D) OTHER INFORMATION:/product= "exon 13"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 9161..9199
(D) OTHER INFORMATION:/product= "exon 14"

(ix) FEATURE:
(A) NAME/KEY: exon
(B) LOCATION: 9498..9713
(D) OTHER INFORMATION:/product= "exon 15"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

2 5 ATGGCGGCGA CGGGCGTCGG CGCCGGGTGC CTCGCCCCCA GCGTCCGCCT 50 

GCGCGCCGAT CCGGCGACGG CGGCCCGGGC GTCCGCTTGC GTCGTCCGCG 100 

CGCGGCTCCG GCGCTTGGCG CGGGGCCGCT ACGTCGCCG A GCTCAGCAGG 1 50 

30 

GAGGGCCCCG CGGCGCGCCC CGCGCAGCAG CAGCAACTGG CCCCGCCGCT 200 

CGTGCCAGGC TTCCTCGCGC CGCCGCCGCC CGCGCCCGCC CAGTCGCCGG 250 

3 5 CCCCGACGCA GCCGCCCCTG CCGGACGCCG GCGTGGGGGA ACTCGCGCCC 300 

GACCTCCTGC TCGAAGGTAA AAAACAAGGC TGAATCCTCA GATCACTCCG 350 

CGTCTTCGTT TTACCAAATA CGGTACTGCG AAGTGGTGCT GTATATGTGA 400 

40 

AGTTTCTGTC GATTTCTTCC TG ACGGATGT TCAGTCGATT CAGTTGTATA 450 

TATGTGATAC GTTCGTTGTT CATCGATCGT ACAGATTTAC CAGCACACTA 500 

4 5 GATAG AAATC GAGACCGACG CGGGCAGATC AATAGATTTT TCTAGACGTT 550 

TTATTGG ATC GTGAG ATGAT TGATTGGGGT GGCGTGTCG A TACGATAGCG 600 

GTGCACCGCC GATGTATCGG GGCATGTGCA CGTGGTTGGG TCTCAGCAGA 650 

50 

CATATCACTA GACTGGTATC GTAATTTACT AGTACTACTG GAAAGAGGAC 700 

TAAAAAGGCT AGGCCAAGTG CACGCATGTT GGGAACGTTG TTAAATTGAT 750 

GAGTTTGTCC TTTGCTTGGG CTGGTATTAT TACCAAAAAA TGGTGTTAGT 800

CCCTGTACTT ATTAATGGGA AAATCTTAAC ATGACACTGG GGTTTATGAG 850

TCTCCA ATTG TATATTCTCA GCACTC AACT GATTTTACTG ATACTGTAGT 900 

5 GGAAATGACA CGTGAGCACC CCCCTTCAAG GAATGCAATG CTTCTTTCTG 950 

TTTTATATTA CAGGAACTAG AAGGAGCTTC CACCTTTGAG TACAGAAGTA 1000 

CTCCCTCCGT TCC A AAATAG ATG ACTC A AC TTTGTACTAA TTTTGTACTA 1 050 

10 

T AGTTAGTAC A A AGTTG AGT CATCTATTTT AG A ACGG AG G G AGT AGTATC 1 1 00 

G AA ATTG A AG ACCCTTGTAT TACTGTCTTG TTTTTC A ATG AAAATGGGAG 1 1 50 

1 5 GCCC ATGCAG TAAGTCACAT GGGC ACCTGG G AGGCTGGGA TC ATGTGTGC 1200 

TTTGC AG AGT ACTAG ACCC A GCTC ACCCTC TGTTAG ATT A CTTGTTGGGC 1 250 

TGCTA CTTTG TGTTTGCTGT GCAGTATATC AG ACATCCTG A ATTTGGC AT 1 300 

20 

CTAGCTG AGA ACAGA ATGCA GGTTGCACCA TTCTTATTAT TGCTAAACTG 1 350 

TTGTC ACGC A ATTTATA A AG A ATGTG ATCT TCTG AGTATT A ATTA ATC AT 1 400 

2 5 GTTCTGCTA A TATCTGTCCT CGCTCTGGTG TTG AC A A ATA TACC ATATG A 1 450 

ATATTTTCCA TTTTGCAACC AGGGATTGCT GAGGATTCCA TCGACAGCAT 1500 

AATCGTGGCT GCAAGTGAGC AGGATTCTGA GATCATGGAT GCGAATGAGC 1550 

30 

AACCTC AAGC TA AAGTTACA CGTAGCATCG TGTTTGTGAC TGGTGA AGCT 1 600 

GCTCCTTATG CAAAGTCAGG GGGGCTGGGA GATGTTTGTG GTTCGTTACC 1650 

3 5 A ATTGCTCTT GCTGCTCGTG GTC ACCGTGT G ATGGTTGTA ATGCC A AG AT 1 700 

ACTTGAATGG GTCCTCTGAT AAAAACTATG CAAAGGCATT ATACACTGCG 1750 

AAGC AC ATTA AG ATTCC ATG CTTTGGGGG A TCAC ATGAAG TG ACCTTTTT 1 800 

40 

TCATG AGTAT AGAG AC A ACG TCG ATTGGGT GGGTACAC AA TCACCTTCTT 1 850 

ATTCTCTGTT G A ATTGTAGC A ACTGTTTAT CCTTGTTTAC ACTTCTTTTA 1 900 

4 5 GCCCTGC A A A G AC AT ATGTG ATTTCC ATAC TTTTTTGTT A TTTCCCTTGT 1 950 

ACTCTTGCTC ATGAAGGTCA AAATATCATA TATCCATGGA AGTCATGCAT 2000 

GTGCCTAGTA TTTTTGGTGT CGGTGCCTTT AACTTTCAGG GATTAATACG 2050 

50 

TGG A ATTTG A TAACTA AAGT TTATTTTATT G AAAAAA ATT GTAGGTTGG 2 1 00 

TG AGCCCAC A GCCACGC AGT GGCACCACTG CTTGCAC ATG ATTTTGCATT 2 1 50 

TCTGTTTGCA CCGAGCACTT CATGTGAATA AGGTGTAAAA TCATAAAGTA 2200

CCAATTTTAT TCTGCCAATT GCACTTAAGA GTATATACAT TTATCTTGGC 2250

CTCAATCATG GGAGTACTGT GCATTCAGTG CACCATCATT GTTCTAAGGA 2300 

5 GAAAATGTGG GTGCAAGGAA GACACTTTTG TCCCTTAATA AAAGGCAGGC 2350 

ACTCTGTTGT CATATAGATA GAAAGCAACA AACTTATTTC AAAGAGCTAA 2400 

CAATGGCAAA AGAACCAAAA AAAGCATGCT AAGGCGGTGA CACCAAAAGG 2450 

10 

TGAGGGGGGC CTTGTGACTG ACAGCACCCC AAACTATTGC CATTGTTTTA 2500 

CTAAATGAAG ATCATTTTAG AAGCTCTCAG GAACTTCGAA AACAGTGGCT 2550 

1 5 TTCCGTCCAC AGATCGTCTG TTA ATATTTT TGTCC AGTG A TACTTTTTTT 2600 

GCTCCTTACA AGAGTGCCTA TGTTGAC ATA TACATTGTTA AGTTGTTCAT 2650 

AAGTTTACTT CTTATTCTAA ACAGCAAGTG CCTAATGCTT GCATTTATTT 2700 

20 

TGGCTATTTA TTTTTATTCT CATTTCAATC AACACTTTTG TTCAGGTGTT 2750 

TGTCGATCAT CCGTCATATC ATAGACCAGG AAGTTTATAT GG AGATAATT 2800 

2 5 TTGGTGCTTT TGGTGATAAT CAGGTACACT ACACTATACT AAGCTCCTAG 2850 

TTG ACTAAGT CGTA AGTTGT ACCTCCTCGC TG ACCGGCTG CTCTATGTCG 2900 

TGCAGTTCAG ATACACACTC CTTTGCTATG CTGCATGCGA GGCCCCACTA 2950 

30 

ATCCTTGAAT TGGGAGGATA TATTTATGGA CAG AATTGCA TGTTTGTTGT 3000 

GAACGATTGG CATGCCAGCC TTGTGCCAGT GTACGTTGTT TGTGGATCTG 3050 

3 5 AAAGTCC A AT CCTTTATTCA TTCTCTGCTT TGCAGTGTGC CCATGTCTAC 3 1 00 

ATTTCTTTTA TGCTTTTTTC ATGTCTGTTC TTATATTGCA TATATGCTTA 3 1 50 

TGG AGTCTAA A AGTTACCGG AGGG AATAAC TCTTAAGGAT TTCCTCAATC 3200 

40 

AATTATCTTT AGCTTTAGTT AACATTTACT GTGGCAAACA TAATGTGTTT 3250 

TGAGATTTAC AAGTTCAGAG ATTGCACTTC ACTAGTTCGT AGCTAATCTG 3300 

4 5 ATGTTTTCCC CGAGAAAATG CCTAAAGCTT TGTGTCTTG A TGCATTGATA 3350 

GAAAA AGAGT TTATGTACAC TCCCAAAGAG GGGACCCAAA ATTACA ACAC 3400 

CACACCCCTG AGAACTAGGC GCTGCCGGAA GAAGCGATGC AAGCCCCACT 3450 

50 

GCCCCTGCCT TAGCTCAAAG CCGGGCGTCA GCTTGATTGT GTCAAGTAAG 3500 

CTAGCAGTGC TAGATTGCGC AAGGTCGATT CGTCGAAGAT GACAGTGTTG 3550 

CGCTGCTTCC AAATCCACCA AACTATGAGC ATGATCACTG GAGAAGTACC 3600

TTTTCTCGCG GCTGAGGGGG TGGACTGGTG GTCTGCTGCT GCCAGTTTTC 3650

AGATAATCTG AAAAATGCAT GTTTTG ATGA TTTTAGTATC TTGCGGACCC 3700 

5 TGGGTACCAC CTAAGCTTTC ACACAGTAAT TTGCAGTTAC ACCTATAAAA 3750 

GTAACGGTCA TGATATGCAT GTGTTTTGGG TAGATCATGG TGCATGCATT 3800 

TTAGGAATTA GGACATGCCA GAACCACGTG AGGCTTATGG GGCAATTCAT 3850 

10 

TTGTTCCATT ATACG AGTCA TGAATATGGT TCAGCATGTT TGG ACGCTAC 3900 

TTGTTTGGGG CAATTTCAGA TGGTGAATTG TAGCTGCTTG ATGTTGGCTA 3950 

1 5 GCTGGCTTAT TTTGTACAAG TATCG ATGTT AGATGCATAT TTCCTTTTGT 4000 

TCTTGTGCTG TTTGCCATGT TGTATTCCCC TTTTCTGTCG CCAGTGTTGC 4050 

ATGTT AA ATT GGTTTTC ATT AC ATAATC A A CTTTGTTGCT G ACATC AGTC 4 1 00 

20 

ATTTTTATTC AGCCTTCTTG CTGC A A A ATA TAG ACC ATAC GGTGTTTAC A 4 1 50 

GAGATTCCCG CAGCACCCTT GTTATACATA ATTTAGCACA TCAGGTTTGG 4200 

2 5 GTCTATCACC TTTCATTATC CGTACATGGC TTTGTAAGTC GGTTCACACG 4250 

TATCGTCATA CTGTATGTTA TTTCAATGTC ATTAGGGTGT GGAGCCTGCA 4300 

AGTACATATC CTG ATCTGGG ATTGCCACCT GAATGGTATG GAGCTTTAGA 4350 

30 

ATGGGTATTT CCAG AATGGG CAAGGAGGCA TGCCCTTGAC AAGGGTGAGG 4400 

CAGTTAACTT TTTGAAAGGA GCAGTTGTG A CAGCAGATCG AATTGTGACC 4450 

3 5 GTCAGTCAGG TG AAATACTC AATACTTCTC TTTTTTCTTT GCGGGATGTT 4500 

CTTCAGTTCA ATTGCCCTGT CTTTCACCCA ATTAAGAAAT GATTTAATCT 4550 

TTTGTTTCTA GGGTTATTCA TGGGAGGTCA CAACTGCTGA AGGTGGACAG 4600 

40 

GGCCTCAATG AGCTCTTAAG CTCCCGAAAA AGTGTATTGA ATG GT A ACT A 4650 

TATTTGAATC CACTTATCTT CTTCTG AA AC ATATTTACAG AAATAGATGG 4700 

4 5 ATGGGTTGCA AG AATAAATT CAGTTTGCTC TTTCGGTATG AAGGAATTGT 4750 

AAATGGAATT GACATTAATG ATTGGAACCC CACCACAGAC AAGTGTCTCC 4800 

CTCATCATTA TTCTGTCG AT GACCTCTCTG GAAAGGTGTG TGG ATA GT AC 4850 

50 

CCTATATA AT AACATGTATA TCTGATCTAG TACTTTCTTT TTCTTTGCTA 4900 

GTTTGCTTCC CATGATGTTC TCACTAACTA ATCCTATGTG GTTTGGCATA 4950 

CTTGTCAGGC CAAATGTAAA GCTGAATTGC AGAAGGAGCT GGGTTTACCT 5000

GTAAGGGAGG ATGTTCCTCT GGTTAGATAC AAACCCCTAA GATATATATT 5050

TTTTA AATCC CTAA A A A A AA CTTGCCG ATC ATCTC ATTAG CTTGATTC AC 5 1 00 

5 AGATTGGCTT TATTGG AAG A CTGG ATTACC AG AAAGGCAT TGATCTCATT 5 1 50 

AAAATGGCCA TTCCAGAGCT CATGAGGGAG GACGTGCAGT TTGTAAGTTC 5200 

ATATTCl 1 I 1 TCTTGAGACT AGAGTATAAA TCAAACATGT AGGTGTGGGG 5250 

10 

TGGTATAATA CAGACATAAG TTCCAGCTAT TGCTTCCATG AGAATTTTAA 5300 

TGCTATTCAG TAATATGCTA CTGCAAGTTT TGAAACAAAG TTGGAAGCAA 5350 

1 5 TAAATATATG TGTAGCACTG ACCATGC AGT GCC ACTATAG CTGGAATGTC 5400 

CTGTA GTCT A TGTGATCTAA CACACTCAAC AACATGTTTT CGCATACAAA 5450 

CACATGCGTG CGCGCAACAA ACATACTCTA CAATAAAATT GGCTTGGTGA 5500 

20 

ACTGCAGACA TGCTCTTATC TCCATTCCAA CATTTCTTGT TTCAACATTG 5550 

GCTGAAGACT AAGAGAAGGG GGACCCAGGG TGATGTAGCC AACTAGATCC 5600 

2 5 AGTAAGGAAG CTAGCCGAGC CTAGGAGGAT TCGCTTAGGT AGCTGGAACG 5650 

TAGGGTCTCT GACAGGGAAG CTTCGGGAGC TAGTCGATGC AGTGGTGAGG 5700 

AGAGGTGTTG ATATCCTTTG CGTCCAAGAA ACCAAATGTA GGGGACAGAA 5750 

30 

GGCGAAGGAG GTGGAGGATA CCGGCTTCAA GCTGTGGTAC ATGGGACGGC 5800 

TGCAAACAGA AATGGCGTAG GCATCTTGAT CAACAAGAGC CTTAAGTATG . 5850 

3 5 GAGTGGTAGA CGTCAAGAGA CGTGGGGACC GG ATT ATCCT CGTCAAGCTG 5900 

GTAGTTGGGG ACTTA GTTCT CAATGTTATC AGCGTGTATG CCCCGCAAGT 5950 

AGGCCACAAT GAGAACGCCA AGAGGGAGTT CTGGG AAGGC CTGG AAGACA 6000 

40 

TGGTTAGGAG TGTACCGATT GGCGAGAAGC TCTTCATAGG AGGAGACCTC 6050 

A ATGGCC ACG TGGGTAC ATC T A AC AT AG GT TTTG AAGGGG C AC ATGGGGG 6 1 00 

4 5 CTTTGGCTAT GGC ATC A AG A ATC A AG AAG A AG ATGTCTTA CGCTTTGCTC 6 1 50 

TAGCCTACGA CATGATTGTA GCTAACACCC TCTTTAGAAA GAG AG A ATC A 6200 

CATCTGGTGA CTTTTAGTAG TGGCCAACAC TAGCCAGATC GATTTCATCC 6250 

50 

TCTCGAGAAG AGAAGATAGG TGTGCGCGCC TAGACTGCAA GGTGATACCT 6300 

TCGGATTCGT GTCCAGCGGG ATAAGCGTGC CAAAGTCGCT AGAATG AAGT 6350 

GGTGGAAGCT CAAGGGGGAG GTAGCTCAGG CGTTCAAGGA GAGGGTCATT 6400

AGGGAGGGCC CTTGGGAGGA AGGAGGGGAT GCGGACAATG TGTGGATGAA 6450

GATGGCG ACT TGCATTCGTA AGGTGGCCTC GGAGGAGTGT GGAGTGTCCA 6500 

5 GGGGATGGAG AAGCGAAGAT AAGG ATACCT GGTGGTGGAA TGATGATGTC 7000 

CAGAAGGCAA TTAAAGAGAA GAAAGATTGC TTTAGACGCC TATACTTGGA 7050 

TAGG AGTGCA GTCAACATAG AAAAGTAC A A GATGGCGAAG AAGGCCGCAA 7 1 00 

10 

AGCGAGCTGT CAGTGAAGCA AGGGGTCGGG CATATGAGGA TCTCTACCAA 7 1 50 

CGGTTAGGCA CGAAGGAAGG CGAAAGGGAC ATCTATAAGA TGGCCAAGAT 7200 

CCGAGAGAGA GGAAGACGAG GGATATTGGC CAAGTCAAAT GCATCAAGGA 7250 

1 5 TGG AGCAG AC C AACTCTTGG TGAAGGACGA GG AG ATTAAG CATAGATGGC 7300 

GGGAGTACTT CGACAAGCTG TTCAATGGGG AGGATGAGAG TCCTACCATT 7350 

GAACTTGACG ACTCCTTTGA TGAGACCATC ATGCGTTTTA TGCGGCGAAT 7400 

CCAGGAGTCC GAGGTCAAGG AGGCTTTAAA AAGGAGGCAA GGCGATGGGC 7450 

CCTGATTGTA TCCCCATTGA GGTGTGGAAA GGCCTCGGGG ACATAGCGAT 7500 

2 0 AGTATGGCTA ACCAAGCTAT TCAACCTCAT TITFCGGGCA AACAAGATGC 7550 

C AG A AG A ATG GAGACGAAGT ATATTAGTAC CAATCATCAA ACAGGGGGGA 7600 

TGTTCAGAGT TGTACTAATT ACCATGGAAT TAAGCTGATG AGCCATACAA 7650 

TGAAGCTATG GGAGAGAATC ATTGAGCACC GCTTAAGAAG AATGACAAGC 7700 

GTGACCAAAA ATCAGTTTGG TTTCATGCCT GGGAGGTCGA CCATGGAAAC 7750 

2 5 CATTTTCTTG GTACGACAAC TTATGG AGAG ATACAGGGAG CAAAAG AAGG 7800 

ACTTGCATAT GGTGTTCATT GACTTG AAGA AGGCCTATAA TAAGATACCG 7850 

CGGAATGTCA TGTGGTGGGC CTTGGAGAAA CACAAAGTCC CAGCAAAGTA 7900 

CATTACCCTC ATCAAGGACA TGTACGATAA TGTTGTGACA AGTGTTCGAA 7950 

CAAGTGATGT CG AC ACT A AT GACTTCCCGA TTAAGATAGG ACTGCATCAG 8000 

3 0 GGGTCAGCTT TGAGCCCTTA TCTTTTTGCC TTGGTGATGG ATGAGGTCAC 8050 

A AGGG ATATA C AAGG AG ATA TCCCATGGTG TATGCTCTTT GTGG ATG ATT 8 1 00 

TGGTGCTAGT TGACGATAGT CGGGCGGGGG TAA ATA AC A A GTTAG AGTTA 8 1 50 

TGGAGACAA A CCTTGGAATC GAAAGGGTTT AGGCTTAGTA GAACTAAAAC 8200 

CGAGTACATG ATGTGCGGTT TCAGTACTAC TAGGTGTGAG GAGGAGGAGG 8250

TTAGCCTTGA TGGGCAGGTG GTACCCCAGA AGGACACCTT TCGATATTTG 8300

GGGTCAATGC TGCAGGAGGA TG G G G GT ATT GATGAAGATG TGAACCATCG 8350 

A ATCA AAGCT GGATGGATGA AGTGGCGCCA AGCTTCTGGC ATTCTTTGTG 8400 

ACAAGAGAGT GCCACAAAAG CTAAGGCAAG TTCTACAGGA CGGCGGTTCG 8450 

5 ACCCGCAATG TTGTATGGCG CTGAGTGTTG GCCGACTAAA AGGCGACATG 8500 

TTCAACAGTT AGGTGTGGCG GAGATGCGTA TGTTGAGATG GATGTGTGGC 8550 

CACACGAGGA AGRATCGAGT CCGGAATGAT GATATACGAG ATAGAGTTGG 8600 

GGTAGCACCA ATTGAAGAGA AGCTTGTCCA ACATCGTCTG AGATGGTTTG 8650 

GGCATATTCA GCGCACGCCT CCGAAAACTC CAGTGCATAA CGGACGGCTA 8700 

1 0 AAGCGTGCGG AGAATGTCAA GAGAGGGCGG GGTAGACCGA ATTTGACATG 8750 

GGAGGAGTCC GTTAAGAGAG ACCTGAAGGT TTGG AGTATT ACGAAAGAAC 8800 

TAGCTATGGA CARGGGTGCG TGGAAGCTTG TTATCCATGT GCCAGAGCCA 8850 

TGAGTTGATC ACGAG ATCTT ATGGGTTTCA CCTCTAGCCT ACCCCAACTT 8900 

GTTTGGGACT AAAGGCTTTG TTGTTGTTGT TGTTGTTGTT GTTGTAGCCA 8950 

1 5 ACTAA ATCCA GTTGATCAGT GGTTTTTACT CTTATTTTTA C AGGTCATGC 9000 

TTGGATCTGG GGATCCAATT TTTGAAGGCT GGATGAGATC TACCGAGTCG 9050 

AGTT AC A AGG ATA A ATTCCG TGG ATGGGTT GGATTTAGTG TTCCAGTTTC 9 1 00 

CCAC AG A ATA ACTGC AGGGT ATGCCG AG AA CTTCTTA AC A AG ACCTTCGT 9 1 50 

TATCAGCTTG GATATATTAT AATGTTCAAA ACATTTATGT CTCTCTTTTT 9200 

2 0 GTGCAGTTGC G ATATATTGT TAATGCCATC CAGGTTTG AA CCTTGTGGTC 9250 

TTAATCAGCT ATATGCTATG C A AT ATGGT A CAGTTCCTGT AGTTCATGGA 9300 

ACTGGGGGCC TCCGAGTAAG ACAACTGCCT TGAAAATTAT CGTTATCTTG 9350 

GCTCCAACGC AAATGTTCTA ATTGGCTCGT GTATTCAACA GGACAC AGTC 9400 

GAGACCTTCA ACCCTTTTGG TGCAAAAGGA GAGGAGGGTA CAGGGTACGC 9450 

2 5 ACTGCTCAAT TTTAGCTAAC TTTCAGTTTA TCTTTTTGCA ATGTCTTGGG 9500 

GGTTCATTGC GCCATAAATC AACTTGTGAT AATTAACTGT TACTGTTCTG 9550 

TACTTGCAGG TGGGCGTTCT CACCGCTAAC CGTGGACAAG ATGTTGTGGG 9600 

TAAGTTTTTG CTG AGCTCTT GTCCGGTTAT AGGATCGACC TTGGCTGTAG 9650 



WO 99/14314 




U98/00743 




CATGGTACCT TAGTGCCCCT TGTATATAGA CCTAACCTGA TGGACTCACT 9700

TTGTCTACAC TAATCATAGT AGTCGATTGC CCGGAGGCGT TTTGCTTGGA 9750

TTCTGCTAAT TTAATTTTCA TGACGATAAC TCATACCATG GTTTGGTTCT 9800

CCGATGGGGG CCAGAATGGC GTCTAGTGTC TGCGATCTGT GTAACTAGCC 9850

AATGCCGGGT TGTTCCAAGT GAAAATTTAC CTTTTGACCA TTGTGCAGGC 9900

ATTGCGAACC GCGATGTCGA CATTCAGGGA GCACAAGCCG TCCTGGGAGG 9950

GGCTCATGAA GCGAGGCATG ACGAAAGACC ATACGTGGGA CCATGCCGCC 10000

GAGCAGTACG AGCAGATCTT CGAATGGGCC TTCGTGGACC AACCCTACGT 10050

CATGTAGACG GGGACTGGGG AGGTCGAAGC GCGGGTCTCC TTGAGCTCTG 10100

AAGACATGTT CCTCATCCTT CCGCGGCCCG GAAGGATACC CCTGTACATT 10150

GCGTTGTCCT GCTACAGTAG AGTCGCAATG CGCCTGCTTG CTTGGTCCGC 10200

CGGTTCGAGA GTAGATGACG GCTGTGCTGC TGCGGCGGTG ACAGCTTCGG 10250

GTGGATGACA GTTACAGTTT TGGGGAATAA GGAAGGGATG TGCTGCAGGA 10300

TGGTTAACAG CAAAGCACCA CTCAGATGGC AGCCTCTCTG TCCGTGTTAC 10350

AGCTGAAATC AGAAACCAAC TGGTGACTCT TTAGCCTTAG CGATTGTGAA 10400

GTTTGTTGCA TTCTGTGTAT GTTGTCTTGT CCTTAGCTGA CAAATATTTG 10450

ACCTGTTGGA TAATTCTATC TTTGCTGCTG TTTTTCTTTT GGTCAAAAGA 10500

GGGGTTCCCT CCGATTTCAT TAACGAAACC ACCAAAATAA CAGCACCCAG 10550

TGCAGGTCTC AGGTTCAGAT ATACTTAAGA CTACTAAATC TAACAGCAGC 10600

TAAAAAGCTT AAAGATTCAG GCGACATAAC CGAACAAAAT CCACAACCGA 10650

AGGGACCAAA GCAGGACAAG TAAAAAGGCA GNCGACACAA AGCGCAGGTC 10700

GCTGAAAAGG CAAGCAGACA GAGGTCTGCA TTCTGTCAAC ACCACTTGTG 10750

AAAAATGAAG AGAAGATCGA GAATTCCCGG GAATCCG 10787



(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 647 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:
(A) NAME/KEY: Protein
(B) LOCATION: 1..647
(D) OTHER INFORMATION: /product= "deduced amino acid sequence for SSS I"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Met Ala Ala Thr Gly Val Gly Ala Gly Cys Leu Ala Pro Ser Val Arg
1 5 10 15

Leu Arg Ala Asp Pro Ala Thr Ala Ala Arg Ala Ser Ala Cys Val Val
20 25 30

Arg Ala Arg Leu Arg Arg Leu Ala Arg Gly Arg Tyr Val Ala Glu Leu
35 40 45

Ser Arg Glu Gly Pro Ala Ala Arg Pro Ala Gln Gln Gln Gln Leu Ala
50 55 60

Pro Pro Leu Val Pro Gly Phe Leu Ala Pro Pro Pro Pro Ala Pro Ala
65 70 75 80

Gln Ser Pro Ala Pro Thr Gln Pro Pro Leu Pro Asp Ala Gly Val Gly
85 90 95

Glu Leu Ala Pro Asp Leu Leu Leu Glu Gly Ile Ala Glu Asp Ser Ile
100 105 110

Asp Ser Ile Ile Val Ala Ala Ser Glu Gln Asp Ser Glu Ile Met Asp
115 120 125

Ala Asn Glu Gln Pro Gln Ala Lys Val Thr Arg Ser Ile Val Phe Val
130 135 140

Thr Gly Glu Ala Ala Pro Tyr Ala Lys Ser Gly Gly Leu Gly Asp Val
145 150 155 160

Cys Gly Ser Leu Pro Ile Ala Leu Ala Ala Arg Gly His Arg Val Met
165 170 175

Val Val Met Pro Arg Tyr Leu Asn Gly Ser Ser Asp Lys Asn Tyr Ala
180 185 190

Lys Ala Leu Tyr Thr Gly Lys His Ile Lys Ile Pro Cys Phe Gly Gly
195 200 205

Ser His Glu Val Thr Phe Phe His Glu Tyr Arg Asp Asn Val Asp Trp
210 215 220

Val Phe Val Asp His Pro Ser Tyr His Arg Pro Gly Ser Leu Tyr Gly
225 230 235 240

Asp Asn Phe Gly Ala Phe Gly Asp Asn Gln Phe Arg Tyr Thr Leu Leu
245 250 255

Cys Tyr Ala Ala Cys Glu Ala Pro Leu Ile Leu Glu Leu Gly Gly Tyr
260 265 270

Ile Tyr Gly Gln Asn Cys Met Phe Val Val Asn Asp Trp His Ala Ser
275 280 285

Leu Val Pro Val Leu Leu Ala Ala Lys Tyr Arg Pro Tyr Gly Val Tyr
290 295 300

Arg Asp Ser Arg Ser Thr Leu Val Ile His Asn Leu Ala His Gln Gly
305 310 315 320

Leu Glu Pro Ala Ser Thr Tyr Pro Asp Leu Gly Leu Pro Pro Glu Trp
325 330 335

Tyr Gly Ala Leu Glu Trp Val Phe Pro Glu Trp Ala Arg Arg His Ala
340 345 350

Leu Asp Lys Gly Glu Ala Val Asn Phe Leu Lys Gly Ala Val Val Thr
355 360 365

Ala Asp Arg Ile Val Thr Val Ser Gln Gly Tyr Ser Trp Glu Val Thr
370 375 380

Thr Ala Glu Gly Gly Gln Gly Leu Asn Glu Leu Leu Ser Ser Arg Lys
385 390 395 400

Ser Val Leu Asn Gly Ile Val Asn Gly Ile Asp Ile Asn Asp Trp Asn
405 410 415

Pro Thr Thr Asp Lys Cys Leu Pro His His Tyr Ser Val Asp Asp Leu
420 425 430

Ser Gly Lys Ala Lys Cys Lys Ala Glu Leu Gln Lys Glu Leu Gly Leu
435 440 445

Pro Val Arg Glu Asp Val Pro Leu Ile Gly Phe Ile Gly Arg Leu Asp
450 455 460

Tyr Gln Lys Gly Ile Asp Leu Ile Lys Met Ala Ile Pro Glu Leu Met
465 470 475 480

Arg Glu Asp Val Gln Phe Val Met Leu Gly Ser Gly Asp Pro Ile Phe
485 490 495

Glu Gly Trp Met Arg Ser Thr Glu Ser Ser Tyr Lys Asp Lys Phe Arg
500 505 510

Gly Trp Val Gly Phe Ser Val Pro Val Ser His Arg Ile Thr Ala Gly
515 520 525

Cys Asp Ile Leu Leu Met Pro Ser Arg Phe Glu Pro Cys Gly Leu Asn
530 535 540

Gln Leu Tyr Ala Met Gln Tyr Gly Thr Val Pro Val Val His Gly Thr
545 550 555 560

Gly Gly Leu Arg Asp Thr Val Glu Thr Phe Asn Pro Phe Gly Ala Lys
565 570 575

Gly Glu Glu Gly Thr Gly Trp Ala Phe Ser Pro Leu Thr Val Asp Lys
580 585 590

Met Leu Trp Ala Leu Arg Thr Ala Met Ser Thr Phe Arg Glu His Lys
595 600 605

Pro Ser Trp Glu Gly Leu Met Lys Arg Gly Met Thr Lys Asp His Thr
610 615 620

Trp Asp His Ala Ala Glu Gln Tyr Glu Gln Ile Phe Glu Trp Ala Phe
625 630 635 640

Val Asp Gln Pro Tyr Val Met
645

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5072 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:
(A) NAME/KEY: promoter
(B) LOCATION: 1..4993
(D) OTHER INFORMATION: /function= "region containing promoter of SSS I"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TCTAGATGCA TGCTGGATAG CGGTCGATGT GTGGAGTAAT AGTAGTAGAT GCAGAATCGT 60

TTCGGTCTAC TTGTCGCGGA CGTGATGCCT ATATACATGA TCATACCTAG ATATTCTCAT 120

AACTATGCTC AATTCTATCA ATTGCTCGAC AGTAATTCGT TTACCCACCG TAATACTTAT 180

GATCTTGAGA GAAGTCACTA GTGAAACCTA TGCCCCCCAG GTCTATTTTG CATCATATTA 240

ATCTTCCAAT ACTTAGTTAT TTCCATTGCC GTTTATTTTA CTTTGTATCT TTATTTCTTT 300

TTATTATAAA AAATACCAAA AATATTATCT TATCATATCT ATCAGATCTC ATTCTCGTAA 360

GTGACCGTGA AGGGATTGAC AACCCCTTTA TCGTGTTGGT TGCGAGGTTC TTGTTTGTTT 420

GTGTAGGTGC GTGTGACTCG CACGTCTCCT ACTGGATTGA TACCTTGGGT TTTCAAAAAC 480

TGAGAAAAAT ACTTACGCTA CTTTACTGCA TAACCCTTTC CTCTTTAAAA AAAAAAACCA 540

ACGTAGTATT CAAGAGGTAG CACGCTACCA TCCTCTCCAA CAGGAGCGCG GAGATCTTTG 600

TCCGGCAGGT TGATGCGGGC CGGGGAAGAA CTCCAGCTGC CTTGGCCAGC TTGGTCGTGA 660

GCCGCCCCAG CGGCGTCTTG AACCTGTCCA CGTAGCGCTC CCTGACACGC GGCGTGAACT 720

GAGAAGGCTT GTCGATGAAC TCCAGCTGTT GTGCCAGCCT AGCTTGCGCC TTCTTCTGCT 780

GGGTCATGCC CTTCGAGAAA CCCACCTTGG CCACCCTTGT GCTTGAGCGG CGCGCCACCT 840

CAGCAGGCGG CGGCGTGGGG ATGAAGAGGG TGTCTGCTTC CGGAGCAGGC GGGTCGGCGT 900

TGAACTTGAA AGGCGGTGGC CCCATGATGG ATGGGGGGAG CATGCCAAAG ACTTGGTTGA 960

GGAAAGTGGT GTTGGCGTCC ACCTCCAGTG CCTGCAGTTT GGAAGCCAGA CGATTGGCGT 1020

CGATCTCTGG CTCCGGCTGG AAGGAGGCTC GACGCTCCGG TGTGCCAGAA CGCAAAGGGA 1080

GGAGCGGCAG CTCTGGCTGA GCAGACCCCG CGCCCATGTA CTCTGCATTG GGCCAAGGCT 1140

GCAGGGGCAA GCCACCGGGA TGGGGGCGCG AGGTGGACTG CGCACCGGAG GAAGGCCAAG 1200

CTCAACCTCG GTGAGGTTCG CCCCAGACCA GGGCGGCAGG CTCGGGTCCA CAAAGGGCCA 1260

AACCGCCTCG TCCGCCCCGA AACTGTCCAG GACAGACGGC GGACGACGGA AGGCCGTGTC 1320

GTCGAGCTCG AGCAGCAGAG GGTCCGTGCG GGTGATGTCT TGCCAAATGG ACTCCACCTC 1380

CAGCAGGAAG GGGGACTGGT CCATCGCCCC TGGCCAAGCC ACTGGTACGC CAAAGATGGC 1440

ATCAGCAGCG TTTGCACCAG GGGGAGCAGC CACACCTTGG AGGACAGGGA GGGTGCGGAC 1500

GTCGACGGCA GCAAAACGTG GCTGGAGCAA GTTGCCGTCG CGTGCCGGCC TCGGCGAGCG 1560

CGAGCGGCTG TAGGAGCGCT CGGTGCCCTC AGACTCGGAC AGTGCGCCAG TGGGAGAGCC 1620

ATGGCGACGC CGGCCACCAC TGGACGTGCC ATGGCGCTGG TCCTGACGGC GCCTGGATGG 1680

CCCGTCCTCG CGGGCAGCTC CACCTGAGCG GCACCCGAGG AGCACACCCC GCCAAGCTGG 1740

GCCAGGGCGG CTGCGGCGAC GGCGACGGCC GCGGTCGCGG TCTGCACCAT CATCTTCATC 1800

TTCGTCATCG TGGCGCCTCG GACAAGGATG CTCGCTGTCA CCGACGCGAG GGACGTGAGC 1860

CGGCTCAGCC CGCCCTTCCT CGACGTGGCG AGCCCTGCGG ATATGCTCCT CGAGCGGCCA 1920

TTGGGGGTCG TTGGCGCGCG GCATCTCGGG GTCGCGGTCA GCTATCGGGG TGTAGTCCTT 1980

TGTGGTGTCC AGGTGGATGA GCAGAGAGAA ATCCGGCCCC TCTAGCCCCT CGTCCCGGGG 2040

GCAGCCCTCC GGCAGCGTCT GGCGGCCCCT GGGGTCCAGG GGTCGATCGA TGATGGAGAA 2100

CCCCCTTTTG GTGGGGATGT CGTCCGGACT CCATGCCCAC ACCCAGGCAA AGAGGCAGGC 2160

CGTGTTGGAG AGGGAGGTCG TCTGCCGCTC CAACCAGTCG ACGTGGCATG TCTTCCCGAG 2220

CGCATCCTGC CCCGCCTCCT TGTTCCAGGA CTGCACCGGC ATGTTCTCGA CGGCGATGCG 2280

GCAGTAGTAC CGCCAGACAC GGCGGTGGCC GTGTGCCGAT GGTGACCAGG CCGACAGGGA 2340

GAGCGCGACG CCCCAGCAGG AGACGACCCC AGCGTCGAAA GCGATGTCCC GGTGCCTGAA 2400

GTGGACGAGC CCAGAGATGG CCAGGCGCAT TGACGCGGGG AAGGGGAAGG AGTTAGGATG 2460

GGCGACGCGG CCGGAGTGAA CCGCGGCGTG GTGGCCGACG GGGCTGGAGA GGCAGAGGCG 2520

GAGTCATCCG AGAGAGGTGT ATCAGTGGCT CTGCACAATA CCCAGTGTCG CCACATCATA 2580

TCCTGCTGAA TAACCACACA TGTGTACTGT CGTTAAATAA ATCATTGGTC ACGCGAACCC 2640

GGAAAAAGAC GGCGAAAAAT TCACGGACAC ACGACTAGTA GTACCCAATA TACTCGGCAA 2700

AAACAGTGAC ACGTCGTTTT GCGTTGTCGG CCGGTGTTGT CGAGTCATTG TACTATGTTT 2760

TGTCGTTTCT TTCTTTTCTC CAAATCGACA AACCGTTTGT CTTTGGTTAA AAAACAGAAA 2820

CATACAAAAT CAAATGAATG CATTCAAGGG CCGGTAATCC AATTCTGAGC CCAGGCTCAG 2880

CTACACCCGC CCTTACAAAA AAATCAAAAT AAATACTAGA AAAATTCAAA AAATTCCAAT 2940

TTGTTTGTGC GTGGTAGATA ATTTGATGCG TGAGGTACGC TTCAATTTTC AAATTATTTG 3000

GACATCTGAG CAGCTCTCAG CAAAAAAGAC AAATTCGGGG TCTGTAAAAA TGTTTACTGT 3060

TCATGCACTG TTCTGACCCG ATTTGTCTTT TTTGCTGAGA GCTTCTCAGA AGTCCAAATG 3120

AGCTAAAATT TTGAGCGGAG CTTACGTGAT AAAATGTCTA TCATGCAAAA AAGGATTGGA 3180

ATTTTTTGAA TTTTTTTTAT TTTTTGTGAT TTGTTTCCTG GACGGGTGCA GATAAGCCTG 3240

GGCACCGAAA CGCCGCACTC AGGCTCATCC TTTTCTATAA AAGAAAAGAA ATACATACAA 3300

TTTCCCTCTG TTTTTTGAGC AAGGGGCACC ACCCACCAAA GAGTTTTCAA CTCACATGGT 3360

ATTAGAGCAT CTACAGCCGG GCGTCTCAAA CCAGCCTCAT ACGCTTGAGC GGGTCGCCTT 3420

GGTCACGATT TTTTGACCCA GACGGGCCCC TCAAACGGTC CTTAAACGCC CAGGCTGACC 3480

GACAACCCAC ATATCCAGCC CAAATATGGG GTGGATATGG GGGCGCCCGG GCACGCCAGC 3540

CCGCGGACAC CACACATCTT CAGTTTCTAA TTTGAGATAT CCGGATGTGG AATGCGTTTT 3600

TGAGGGGTGA CCGGTCCCTG TCCGTGGATG CGCCCGGACG TTTGAGGGGT TGGATTTGCC 3660

AAGTCTGATT AGAGATGCTC TTAGGTGTTC CACCCCCATC CCTTGATGGC TAGGGCAAAC 3720

TCTCCCCTCC AAACTTTGTC GGCGAGCCTG TGGATTCTTC TCTCCTCTGC CCGCTGCTCC 3780

GGCGGCTGAT GGCGGGGAGG AGAATCCCGG TGTCTTCGCT TGGTTAGTTG TTTAAGTTAC 3840

GTACTTTTTT AGTCCTCGCA GGTGCGGCGT TCGGACGTAT GGTCGTGCTT CTTTTTTGAG 3900

TTTGTCTTCC GGGCTCTGAT CCTCCTCGAG TTCGTCCATC TGGACGTACT CGACGGAGCT 3960

CCGGCATAGA TTCCTATCAT CGTCTTGGTG AGGTGAGGTT ATGGTTTCTT GTCATGTGGG 4020

CAGATTTGGT GCCAGATGCT TCATATCTAT TCAAGGGTTC AGCGGCAACA ACTGCGGCTC 4080

CAGAGCGATG GTCCTTAAGG GCACGTGCAC GAAGACTTCA CGGCTGTTAT CGACAAGGTC 4140

AAGCCGGCTC CGATAGGGGA GCAGCGACAG CGGCGCGTCA ACCGCTCGTT CTGGCGGCAG 4200

TAGTGGTCGT TCGGTGCTCT CGGAACCTCG ATGTAATTTT TATGATTTTA GAGATGCTTT 4260

GTACTTCCGA TCGATGAACT CTGATAATAG ATATCTCTTC TCTCGCAAAA AAAGAGAGTT 4320

TTCAACTGAA AACAAAAGAG TTTCACTAGT TCTTCTTTTA GAAACAGAGT TTCACTAGCA 4380

CTTTTTTTTG CGAGAAGTCG AGTTTCACTA AGTACTAAAC CCACGCAATT ATTCTCAAAA 4440

AAAAAACCCA CGCAACTGTC TGGATCCATC TTCGTTTTTT CCCCGAGAAT CGTCTGGATC 4500

CATTTTCGTG TGCGAGGCAT CCTCTCATTT TGCACGGCCC AGCTCTCTTC TCGCCGGCGT 4560

ACGCTGCTAC ATGTCGGCAC TCCACGCAAA CAAAAAGAAG CCCAACCGAA AACGCACGCG 4620

CCTTTCCAGG CTCACCACGG AAAAAAATAC CACGCGCCGC TCACGAGCAA ACCGTGACAA 4680

CAGCCAGCCA GATATGGCAA CGGAGGCACG GGCCGCACAC AGCCACTGAA AACCGCAGCT 4740

GCTCTTCCGT CCGTCCGTCC CTCCGCCCGT CCGCGCCACT CCACTCGCCT TGCCCCACTC 4800

CCACTCTTCT CTCCCCGCGC ACACCGAGTC GGCACCGGCT CATCACCCAT CACCTCGGCC 4860

TCGGCCACCG GCAAACCCCC CGATCCGCTT TTGCAGGCAG CGCACTAAAA CCCCGGGGAG 4920

CGCGCCCCGC GGCAGCAGCA GCACCGCAGT GGGAGAGAGA GGCTTCGCCC CGGCCCGCAC 4980

CGAGCGGGGC GATCCACCGT CCGTGCGTCC GCACCTCCTC CGCCTCCTCC CCTGTCCCGC 5040

GCGCCCACAC CCATGGCGGC GACGGGCGTC GG 5072

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1706 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1706
(D) OTHER INFORMATION: /product= "partial cDNA for hexaploid wheat DBE"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

GCT GTG TCG AAG CTT GAC TAT TTG AAG GAG CTT GGA GTT AAT TGT ATT 48
Ala Val Ser Lys Leu Asp Tyr Leu Lys Glu Leu Gly Val Asn Cys Ile
1 5 10 15

GAA TTA ATG CCC TGC CAT GAG TTC AAC GAG CTG GAG TAC TCA ACC TCT 96
Glu Leu Met Pro Cys His Glu Phe Asn Glu Leu Glu Tyr Ser Thr Ser
20 25 30

TCT TCC AAG ATG AAC TTT TGG GGA TAT TCT ACC ATA AAC TTC TTT TCA 144
Ser Ser Lys Met Asn Phe Trp Gly Tyr Ser Thr Ile Asn Phe Phe Ser
35 40 45

CCA ATG ACG AGA TAC ACA TCA GGC GGG ATA AAA AAC TGT GGG CGT GAT 192
Pro Met Thr Arg Tyr Thr Ser Gly Gly Ile Lys Asn Cys Gly Arg Asp
50 55 60

GCC ATA AAT GAG TTC AAA ACT TTT GTA AGA GAG GCT CAC AAA CGG GGA 240
Ala Ile Asn Glu Phe Lys Thr Phe Val Arg Glu Ala His Lys Arg Gly
65 70 75 80

ATT GAG GTG ATC CTG GAT GTT GTC TTC AAC CAT ACA GCT GAG GGT AAT 288
Ile Glu Val Ile Leu Asp Val Val Phe Asn His Thr Ala Glu Gly Asn
85 90 95

GAG AAT GGT CCA ATA TTA TCA TTT AGG GGG GTC GAT AAT ACT ACA TAC 336
Glu Asn Gly Pro Ile Leu Ser Phe Arg Gly Val Asp Asn Thr Thr Tyr
100 105 110

TAT ATG CTT GCA CCC AAG GGA GAG TTT TAT AAC TAT TCT GGC TGT GGG 384
Tyr Met Leu Ala Pro Lys Gly Glu Phe Tyr Asn Tyr Ser Gly Cys Gly
115 120 125

AAT ACC TTC AAC TGT AAT CAT CCT GTG GTT CGT CAA TTC ATT GTA GAT 432
Asn Thr Phe Asn Cys Asn His Pro Val Val Arg Gln Phe Ile Val Asp
130 135 140

TGT TTA AGA TAC TGG GTG ATG GAA ATG CAT GTT GAT GGT TTT CGT TTT 480
Cys Leu Arg Tyr Trp Val Met Glu Met His Val Asp Gly Phe Arg Phe
145 150 155 160

GAT CTT GCA TCC ATA ATG ACC AGA GGT TCC AGT CTG TGG GAT CCA GTT 528
Asp Leu Ala Ser Ile Met Thr Arg Gly Ser Ser Leu Trp Asp Pro Val
165 170 175

AAC GTG TAT GGA GCT CCA ATA GAA GGT GAC ATG ATC ACA ACA GGG ACA 576
Asn Val Tyr Gly Ala Pro Ile Glu Gly Asp Met Ile Thr Thr Gly Thr
180 185 190

CCT CTT GTT ACT CCA CCA CTT ATT GAC ATG ATC AGC AAT GAC CCA ATT 624
Pro Leu Val Thr Pro Pro Leu Ile Asp Met Ile Ser Asn Asp Pro Ile
195 200 205

CTT GGA GGC GTC AAG CTC ATT GCT GAA GCA TGG GAT GCA GGA GGC CTC 672
Leu Gly Gly Val Lys Leu Ile Ala Glu Ala Trp Asp Ala Gly Gly Leu
210 215 220

TAT CAA GTA GGT CAA TTC CCT CAC TGG AAT GTT TGG TCT GAG TGG AAT 720
Tyr Gln Val Gly Gln Phe Pro His Trp Asn Val Trp Ser Glu Trp Asn
225 230 235 240

GGG AAG TAC CGG GAC ATT GTG CGC CAA TTC ATT AAA GGC ACT GAT GGA 768
Gly Lys Tyr Arg Asp Ile Val Arg Gln Phe Ile Lys Gly Thr Asp Gly
245 250 255

TTT GCT GGT GGT TTT GCC GAA TGT CTT TGT GGA AGT CCA CAC CTA TAC 816
Phe Ala Gly Gly Phe Ala Glu Cys Leu Cys Gly Ser Pro His Leu Tyr
260 265 270

CAG GCA GGA GGA AGG AAA CCT TGG CAC AGT ATC AAC TTT GTA TGT GCA 864
Gln Ala Gly Gly Arg Lys Pro Trp His Ser Ile Asn Phe Val Cys Ala
275 280 285

CAT GAT GGA TTT ACA CTG GGT GAT TTG GTA ACA TAT AAT AAC AAG TAC 912
His Asp Gly Phe Thr Leu Gly Asp Leu Val Thr Tyr Asn Asn Lys Tyr
290 295 300

AAT TTA CCA AAT GGG GAG AAC AAT AGA GAT GGA GAA AAT CAC AAT CTT 960
Asn Leu Pro Asn Gly Glu Asn Asn Arg Asp Gly Glu Asn His Asn Leu
305 310 315 320

AGC TGG AAT TGT GGG GAG GAA GGA GAA TTC GCA AGA TTG TCT GTC AAA 1008
Ser Trp Asn Cys Gly Glu Glu Gly Glu Phe Ala Arg Leu Ser Val Lys
325 330 335

AGA TTG AGG AAG AGG CAG ATG CGC AAT TTC TTT GTT TGT CTC ATG GTT 1056
Arg Leu Arg Lys Arg Gln Met Arg Asn Phe Phe Val Cys Leu Met Val
340 345 350

TCT CAA GGA GTT CCA ATG TTT TAC ATG GGC GAT GAA TAT GGC CAC ACA 1104
Ser Gln Gly Val Pro Met Phe Tyr Met Gly Asp Glu Tyr Gly His Thr
355 360 365

AAA GGG GGC AAC AAC AAT ACA TAC TGC CAT GAT TCT TAT GTC AAT TAT 1152
Lys Gly Gly Asn Asn Asn Thr Tyr Cys His Asp Ser Tyr Val Asn Tyr
370 375 380

TTT CGC TGG GAT AAA AAA GAA CAA TAC TCT GAC TTG CAC AGA TTC TGC 1200
Phe Arg Trp Asp Lys Lys Glu Gln Tyr Ser Asp Leu His Arg Phe Cys
385 390 395 400

TGC CTC ATG ACC AAA TTC CGC AAG GAG TGC GAG GGT CTT GGC CTT GAG 1248
Cys Leu Met Thr Lys Phe Arg Lys Glu Cys Glu Gly Leu Gly Leu Glu
405 410 415

GAC TTT CCA ACG GCC GAA CGG CTG CAG TGG CAT GGT CAT CAG CCT GGG 1296
Asp Phe Pro Thr Ala Glu Arg Leu Gln Trp His Gly His Gln Pro Gly
420 425 430

AAG CCT GAT TGG TCT GAG AAT AGC CGA TTC GTT GCC TTT TCC ATG AAA 1344
Lys Pro Asp Trp Ser Glu Asn Ser Arg Phe Val Ala Phe Ser Met Lys
435 440 445

GAT GAA AGA CAG GGC GAG ATC TAT GTG GCC TTC AAC ACC AGC CAC TTA 1392
Asp Glu Arg Gln Gly Glu Ile Tyr Val Ala Phe Asn Thr Ser His Leu
450 455 460

CCG GCC GTT GTT GAG CTC CCA GAG CGC GCA GGG CGC CGG TGG GAA CCG 1440
Pro Ala Val Val Glu Leu Pro Glu Arg Ala Gly Arg Arg Trp Glu Pro
465 470 475 480

GTG GTG GAC ACA GGC AAG CCA GCA CCA TAT GAC TTC CTC ACC GAC GAC 1488
Val Val Asp Thr Gly Lys Pro Ala Pro Tyr Asp Phe Leu Thr Asp Asp
485 490 495

TTA CCT GAT CGC GCT CTC ACC ATA CAC CAG TTC TCT CAT TTC CTC AAC 1536
Leu Pro Asp Arg Ala Leu Thr Ile His Gln Phe Ser His Phe Leu Asn
500 505 510

TCC AAC CTC TAC CCC ATG CTC AGC TAC TCA TCG GTC ATC CTA GTA TTG 1584
Ser Asn Leu Tyr Pro Met Leu Ser Tyr Ser Ser Val Ile Leu Val Leu
515 520 525

CGC CCT GAT GTT TGA GAG ACA AAT ATA TAC AGT AAA TAA TAT GTC TAT 1632
Arg Pro Asp Val * Glu Thr Asn Ile Tyr Ser Lys * Tyr Val Tyr
530 535 540

ATG TAG TCC TTT GGC GTA TTA TCA GTG TGC ACA ATT GCT CTA TTG CCA 1680
Met * Ser Phe Gly Val Leu Ser Val Cys Thr Ile Ala Leu Leu Pro
545 550 555 560

GTG ATC TAT TCG ATA GCG GCC GCG AA 1706
Val Ile Tyr Ser Ile Ala Ala Ala
565

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9289 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Triticum tauschii
(F) TISSUE TYPE: Endosperm

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..9289
(D) OTHER INFORMATION: /product= "genomic sequence of DBE"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

CGG GAC CGT CCC TTG GCA ACT TGG GTT ACG TTG GGA CCT GAC GCT TCG 48
Arg Asp Arg Pro Leu Ala Thr Trp Val Thr Leu Gly Pro Asp Ala Ser
570 575 580

CTT ATC CGG TGT GCC CTG AGA CGA GAT ATG TGC AGC TCC TAT CGG ATT 96
Leu Ile Arg Cys Ala Leu Arg Arg Asp Met Cys Ser Ser Tyr Arg Ile
585 590 595 600

TGT CGG CAC ATT CGG CGG CTT TGC TGG TCT TGT TTT ACC ATT GTC GAA 144
Cys Arg His Ile Arg Arg Leu Cys Trp Ser Cys Phe Thr Ile Val Glu
605 610 615

ATG TCT TAT AAA CCG GGA TTC CGA GAC TGA TCG GGT CTT CCC GGG AGA 192
Met Ser Tyr Lys Pro Gly Phe Arg Asp * Ser Gly Leu Pro Gly Arg
620 625 630

AGG TTT ATC CTT CGT TGA CCG TGA GAG CTT ATA ATG GGC TAA GTT GGG 240
Arg Phe Ile Leu Arg * Pro * Glu Leu Ile Met Gly * Val Gly
635 640 645

ACA CCC CTG CAG GGT ATT ATC TTT CGA AAG CCG TGC CCG CGG TTA TGA 288
Thr Pro Leu Gln Gly Ile Ile Phe Arg Lys Pro Cys Pro Arg Leu *
650 655 660

GGC AGA TGG GAA TTT GTT AAT GTC CGA TTG TAG AGA ACC TGT CAC TTG 336
Gly Arg Trp Glu Phe Val Asn Val Arg Leu * Arg Thr Cys His Leu
665 670 675 680

ACT TAA TTT AAA ATT CAT CAA CCG TGT GTG TAG CCG TGA TGG TCT CTT 384
Thr * Phe Lys Ile His Gln Pro Cys Val * Pro * Trp Ser Leu
685 690 695

TTC GGC GGA GTC CGG GAA GTG AAC ACG GTT TGA GTT ATG CAT GAA CGT 432
Phe Gly Gly Val Arg Glu Val Asn Thr Val * Val Met His Glu Arg
700 705 710

AAG TAG TTT CAG GAT CAC TCC TTG ATC ACT TCT AGC TCC GCG ACC GTT 480
Lys * Phe Gln Asp His Ser Leu Ile Thr Ser Ser Ser Ala Thr Val
715 720 725

GCG TTG TTT CTC TTC TCG CTC TCA TTT GCG TAT GTT AGC CAC CAT ATA 528
Ala Leu Phe Leu Phe Ser Leu Ser Phe Ala Tyr Val Ser His His Ile
730 735 740

TGC TTA GTG TCT GCT GCA GCT CCA CCT CAT TAC CCC TTC CTT TCC TAT 576
Cys Leu Val Ser Ala Ala Ala Pro Pro His Tyr Pro Phe Leu Ser Tyr
745 750 755 760

AAG CTT AAA TAG TCT TGA TCT CGC GGG TGT GAG ATT GCT GAG TCC TCG 624
Lys Leu Lys * Ser * Ser Arg Gly Cys Glu Ile Ala Glu Ser Ser
765 770 775

TGA CTT ACA GAT TCT ACC AAA ACA GTT GCA GGT GTC GAC GAT GCC AGT 672
* Leu Thr Asp Ser Thr Lys Thr Val Ala Gly Val Asp Asp Ala Ser
780 785 790

GCA GGT GAC GCA ACC GAG CTC AAG TGG GAG TTC GAC GAG GAA CGT GGT 720
Ala Gly Asp Ala Thr Glu Leu Lys Trp Glu Phe Asp Glu Glu Arg Gly
795 800 805

CGT TAC TAT GTT TCT TTT CCT GAT GAT CAG TAG TGG AGC CCA GTT GGG 768
Arg Tyr Tyr Val Ser Phe Pro Asp Asp Gln * Trp Ser Pro Val Gly
810 815 820

ACG ATC GGG GAT CTA GCA TTT GGG GTT ATC TTA ATT TCT TTT AGA TTT 816
Thr Ile Gly Asp Leu Ala Phe Gly Val Ile Leu Ile Ser Phe Arg Phe
825 830 835 840

GAC CGT AAT CGG TCT ATG TGT GGA TTT TGG ATG ATG TAT GAA TTA TTT 864
Asp Arg Asn Arg Ser Met Cys Gly Phe Trp Met Met Tyr Glu Leu Phe
845 850 855

ATG TAT TGT GTG AAG TGG CGA TTG TAA GCC AAC TCT CGT TAT CCC ATT 912
Met Tyr Cys Val Lys Trp Arg Leu * Ala Asn Ser Arg Tyr Pro Ile
860 865 870

CTT GTT CAT TAC ATG GGA TTG TGT GAA GAT GAC CCT TCT TGC GAC AAA 960
Leu Val His Tyr Met Gly Leu Cys Glu Asp Asp Pro Ser Cys Asp Lys
875 880 885

ACC ACA ATG CGG TTA TGC CTC TAA GTC GTG CCT CGA CAC GTG GGA GAT 1008
Thr Thr Met Arg Leu Cys Leu * Val Val Pro Arg His Val Gly Asp
890 895 900

ATA GCC GCA TCG TGG GCG TTA CAC GCA AGT CTT CAT AGC AAC CAA AAC 1056
Ile Ala Ala Ser Trp Ala Leu His Ala Ser Leu His Ser Asn Gln Asn
905 910 915 920

TCC TCT CCG CAT TAC AAG CCA CCA ATC GCA GCC ACC ATG ACT TTC TTC 1104
Ser Ser Pro His Tyr Lys Pro Pro Ile Ala Ala Thr Met Thr Phe Phe
925 930 935

ACC ACT GTC AAT GCC ATG AAA ATC TAT ATG TAG ACA TGT CCC ATT GCA 1152
Thr Thr Val Asn Ala Met Lys Ile Tyr Met * Thr Cys Pro Ile Ala
940 945 950

TCG GCA AGA AAG CGA AGC TTC ACG GCA CAC CTT CAT GAA GCC TCT CTG 1200
Ser Ala Arg Lys Arg Ser Phe Thr Ala His Leu His Glu Ala Ser Leu
955 960 965

GCC GAA GAC AAG GAT GCG CCC GAC CGG ATC AAT TCC TAT CTA GAT ACC 1248
Ala Glu Asp Lys Asp Ala Pro Asp Arg Ile Asn Ser Tyr Leu Asp Thr
970 975 980

TAG TGG AGC CAT GCG CCA ATA GCG GAG ATC TCC GAG AGG AAG ACC GGA 1296
* Trp Ser His Ala Pro Ile Ala Glu Ile Ser Glu Arg Lys Thr Gly
985 990 995 1000

ACT CGT CGG ACG TCG GCG TCC AAA TCG AGG AGG CCG GCA TGA AGC ACA 1344
Thr Arg Arg Thr Ser Ala Ser Lys Ser Arg Arg Pro Ala * Ser Thr
1005 1010 1015

TCG AGG ATG GTG ATC CCC ATA CGG GTA GAT CGG GTC GGC CGC CAT CTC 1392
Ser Arg Met Val Ile Pro Ile Arg Val Asp Arg Val Gly Arg His Leu
1020 1025 1030

ACA CCG AGA TTA GGA TGC TTA AAA CGG TTT TTT TGG CAC TAG CAT TAT 1440
Thr Pro Arg Leu Gly Cys Leu Lys Arg Phe Phe Trp His * His Tyr
1035 1040 1045

TTT GCA TCA TCC GTT GGA GAG AAC ATG AGA GAG CCC CAT TTC TTC CAC 1488
Phe Ala Ser Ser Val Gly Glu Asn Met Arg Glu Pro His Phe Phe His
1050 1055 1060

GGT TCT ACC TAT GGG ATC TTG TTC TGC TTG CAA CCG GGC CTC ACG GAA 1536
Gly Ser Thr Tyr Gly Ile Leu Phe Cys Leu Gln Pro Gly Leu Thr Glu
1065 1070 1075 1080

AAC CCG CGC CAG CGG ACC CAC CCC ATG CTA GCA GGG CAC GGC ACC CGC 1584
Asn Pro Arg Gln Arg Thr His Pro Met Leu Ala Gly His Gly Thr Arg
1085 1090 1095

AGC GGC CGG TCC AAA TGG ACG GTG AGA ACC GCA ACG CGA CAC GCC CGG 1632
Ser Gly Arg Ser Lys Trp Thr Val Arg Thr Ala Thr Arg His Ala Arg
1100 1105 1110

CAC TGT CAG CAA AGC GAG AGC GCG CGC ACG GCA CAC GCA CGC TCG GAC 1680
His Cys Gln Gln Ser Glu Ser Ala Arg Thr Ala His Ala Arg Ser Asp
1115 1120 1125

GAA CGG ACG GTG CGA TCG ATC CCT CCC CCC TCG CTC AAC CAC AGT AGT 1728
Glu Arg Thr Val Arg Ser Ile Pro Pro Pro Ser Leu Asn His Ser Ser
1130 1135 1140

ACC CTG CCA CAC TAT CAC GCA CGC ACT CGA GTC ACA CCT CCC ACG AAG 1776
Thr Leu Pro His Tyr His Ala Arg Thr Arg Val Thr Pro Pro Thr Lys
1145 1150 1155 1160

AAC CAA CAG GAG GCG CGG ATC CCA CCG ATA AAT AAC CCC GCC TCG CCG 1824
Asn Gln Gln Glu Ala Arg Ile Pro Pro Ile Asn Asn Pro Ala Ser Pro
1165 1170 1175

CTC CTC CCC AAA ATC AAT CAC CGA TCG CTC GGG GTT CCC GGC ATG ACG 1872
Leu Leu Pro Lys Ile Asn His Arg Ser Leu Gly Val Pro Gly Met Thr
1180 1185 1190

ATG ATG GCC ATG GCC AAG GCG CCC TGC CTC TGC GCG CGC CCG TCC CTC 1920
Met Met Ala Met Ala Lys Ala Pro Cys Leu Cys Ala Arg Pro Ser Leu
1195 1200 1205

GCC GCG CGC GCG AGG CGG CCG GGG CCG GGG CCG GCG CCG CGC CTG CGA 1968
Ala Ala Arg Ala Arg Arg Pro Gly Pro Gly Pro Ala Pro Arg Leu Arg
1210 1215 1220

CGG TGG CGA CCC AAT GCG ACG GCG GGG AAG GGG GTC GGC GAG GTG TGC 2016
Arg Trp Arg Pro Asn Ala Thr Ala Gly Lys Gly Val Gly Glu Val Cys
1225 1230 1235 1240

GCC GCG GTT GTC GAG GCG GCG ACG AAG GCC GAG GAT GAG GAC GAC GAC 2064
Ala Ala Val Val Glu Ala Ala Thr Lys Ala Glu Asp Glu Asp Asp Asp
1245 1250 1255

GAG GAG GAG GCG GTG GCG GAG GAC AGG TAC GCG CTC GGC GGC GCG TGC 2112
Glu Glu Glu Ala Val Ala Glu Asp Arg Tyr Ala Leu Gly Gly Ala Cys
1260 1265 1270

AGG GTG CTC GCC GGA ATG CCC GCG CCG CTG GGC GCC ACC GCG CTC GCC 2160
Arg Val Leu Ala Gly Met Pro Ala Pro Leu Gly Ala Thr Ala Leu Ala
1275 1280 1285

GGC GGG GTC AAT TTC GCC GTC TAC TCC GGT GGA GCC ACC GCC GCG GCG 2208
Gly Gly Val Asn Phe Ala Val Tyr Ser Gly Gly Ala Thr Ala Ala Ala
1290 1295 1300

CTC TGC CTC TTC ACG CCA GAA GAT CTC AAG GCG GTG GGG TTG CCT CCC 2256
Leu Cys Leu Phe Thr Pro Glu Asp Leu Lys Ala Val Gly Leu Pro Pro
1305 1310 1315 1320

GAG TAG AGT TCA TCA GCT TTG CGT GCG CCG CGC GCC CCC TTT TCT GGC 2304
Glu * Ser Ser Ser Ala Leu Arg Ala Pro Arg Ala Pro Phe Ser Gly
1325 1330 1335

CTG CGA TTT AAG TTT TGT ACT GGG GGA AAT GCT GCA GGA TAG GGT GAC 2352
Leu Arg Phe Lys Phe Cys Thr Gly Gly Asn Ala Ala Gly * Gly Asp
1340 1345 1350

GGA GGA GGT TTC CCT TGA CCC CCT GAT GAA TCG GAC TGG GAA CGT GTG 2400
Gly Gly Gly Phe Pro * Pro Pro Asp Glu Ser Asp Trp Glu Arg Val
1355 1360 1365

GCA TGT CTT CAT TGA AGG CGA GCT GCA CGA CAT GCT TTA CGG GTA CAG 2448
Ala Cys Leu His * Arg Arg Ala Ala Arg His Ala Leu Arg Val Gln
1370 1375 1380

GTT CGA CGG CAC CTT TGC TCC TCA CTG CGG GCA CTA CCT TGA TAT TTC 2496
Val Arg Arg His Leu Cys Ser Ser Leu Arg Ala Leu Pro * Tyr Phe
1385 1390 1395 1400

CAA TGT CGT GGT GGA TCC TTA TGC TAA GGT GAT CAT ACT TTA GCT TTA 2544
Gln Cys Arg Gly Gly Ser Leu Cys * Gly Asp His Thr Leu Ala Leu
1405 1410 1415

CCT GCA TCT TGG TAT TTA CAG TAG AAA TTG TTA CGT GGA CCC TTA TTT 2592
Pro Ala Ser Trp Tyr Leu Gln * Lys Leu Leu Arg Gly Pro Leu Phe
1420 1425 1430

GTT GCC TTT TGT GTT GCT CTA GGC AGT GAT AAG CCG AGG GGA GTA TGG 2640
Val Ala Phe Cys Val Ala Leu Gly Ser Asp Lys Pro Arg Gly Val Trp
1435 1440 1445

CGT TCC GGC GCG TGG TAA CAA TTG CTG GCC TCA GAT GGC TGG CAT GAT 2688
Arg Ser Gly Ala Trp * Gln Leu Leu Ala Ser Asp Gly Trp His Asp
1450 1455 1460

CCC TCT TCC ATA TAG CAC GGT ATG CCT GAT TGC TGA AAA TAT TGG CTG 2736
Pro Ser Ser Ile * His Gly Met Pro Asp Cys * Lys Tyr Trp Leu
1465 1470 1475 1480

CAT TTG TTT CTC TCT TTT TCT CAT ATT TTT CTC CTG TCT TTC ACT TGT 2784
His Leu Phe Leu Ser Phe Ser His Ile Phe Leu Leu Ser Phe Thr Cys
1485 1490 1495

ACT ACA TTG CCT CAG ACA GTC ATG ATC AAA GAG AGC AGT GTC ATT AGA 2832
Thr Thr Leu Pro Gln Thr Val Met Ile Lys Glu Ser Ser Val Ile Arg
1500 1505 1510

CAT TTG TAG TTG TCT GCT GAC TTT GAC CAA AAC TTG TAA TTT ACT GTT 2880
His Leu * Leu Ser Ala Asp Phe Asp Gln Asn Leu * Phe Thr Val
1515 1520 1525

GTT AAA GGT CCT TGA ATC ATA TTT TTT TAT AAT ATT ATG TTT GCA AGT 2928
Val Lys Gly Pro * Ile Ile Phe Phe Tyr Asn Ile Met Phe Ala Ser
1530 1535 1540

GGA AGT AAA GTG AAA TTG CAT CTA GTA TTT GTT GTT GCT GTC TTA GTC 2976
Gly Ser Lys Val Lys Leu His Leu Val Phe Val Val Ala Val Leu Val
1545 1550 1555 1560

GTT TAA TTG GAC ATG CAG TAA AAA GGT TTG CAT CTG CAG TTT GAT TGG 3024
Val * Leu Asp Met Gln * Lys Gly Leu His Leu Gln Phe Asp Trp
1565 1570 1575

GAA GGC GAC CTA CCT CTA AGA TAT CCT CAA AAG GAC CTG GTA ATA TAT 3072
Glu Gly Asp Leu Pro Leu Arg Tyr Pro Gln Lys Asp Leu Val Ile Tyr
1580 1585 1590

GAG ATG CAC TTG CGT GGA TTC ACG AAG CAT GAT TCA AGC AAT GTA GAA 3120
Glu Met His Leu Arg Gly Phe Thr Lys His Asp Ser Ser Asn Val Glu
1595 1600 1605

CAT CCG GGT ACT TTC ATT GGA GCT GTG TCG AAG CTT GAC TAT TTG AAG 3168
His Pro Gly Thr Phe Ile Gly Ala Val Ser Lys Leu Asp Tyr Leu Lys
1610 1615 1620

GTA CAG CTG TAC TTG CTG ACT ACA TAG GAT AAT TTT TAA AGA AAG CTA 3216
Val Gln Leu Tyr Leu Leu Thr Thr * Asp Asn Phe * Arg Lys Leu
1625 1630 1635 1640

CAT ATT AGC CAG AAT TTG GGT TAT TAC AAA AAC TAC TGC ATA CTA TAG 3264
His Ile Ser Gln Asn Leu Gly Tyr Tyr Lys Asn Tyr Cys Ile Leu *
1645 1650 1655

CAG TTA CAT GCT CAT TAT CGA GGA GAT GCT CAC ACG CAT CTT ATT TGG 3312
Gln Leu His Ala His Tyr Arg Gly Asp Ala His Thr His Leu Ile Trp
1660 1665 1670

ATT TAA TAC CCA ATT CTG TTT TGA TAT TGG ACT GTT CCC TCT ACA GGA 3360
Ile * Tyr Pro Ile Leu Phe * Tyr Trp Thr Val Pro Ser Thr Gly
1675 1680 1685

GCT TGG AGT TAA TTG TAT TGA ATT AAT GCC CTG CCA TGA GTT CAA CGA 3408
Ala Trp Ser * Leu Tyr * Ile Asn Ala Leu Pro * Val Gln Arg
1690 1695 1700

GCT GGA GTA CTC AAC CTC TTC TTC CAA GTA AGG ACA TGA ATT TAG TAT 3456
Ala Gly Val Leu Asn Leu Phe Phe Gln Val Arg Thr * Ile * Tyr
1705 1710 1715 1720

TAG CCT GCC AGC ACT GTT TGA GTG AGA GTT CAT ACA CAT TTT GTG CCT 3504
* Pro Ala Ser Thr Val * Val Arg Val His Thr His Phe Val Pro
1725 1730 1735

GCA TAA CTG ATA TTT GTT CAA ACT ATT TTT TTT AGC AGT CAC TCA ACA 3552
Ala * Leu Ile Phe Val Gln Thr Ile Phe Phe Ser Ser His Ser Thr
1740 1745 1750

GTT TTA CAT ATA TAT ATA ATA TAG ACT ATT CGT CAC CCT GGG TGA GGA 3600
Val Leu His Ile Tyr Ile Ile * Thr Ile Arg His Pro Gly * Gly
1755 1760 1765

ATA GTT ATT CTT CAC CCA CCT CTA TTT TAA CAT CTA TGC ACC GTA ATT 3648
Ile Val Ile Leu His Pro Pro Leu Phe * His Leu Cys Thr Val Ile
1770 1775 1780

TTA CGT TTC GTA AAT TTG TCT TAT TTT AGA GAT AAA AAG AGA ACG TAA 3696
Leu Arg Phe Val Asn Leu Ser Tyr Phe Arg Asp Lys Lys Arg Thr *
1785 1790 1795 1800

GAA AAC CTA TAA TCG TCG TAA AAA AAA ATA TGT TAC GTA AAA TTA CAA 3744
Glu Asn Leu * Ser Ser * Lys Lys Ile Cys Tyr Val Lys Leu Gln
1805 1810 1815

5 ATG TAA AAA CAT AGT GTA AAA TGT AC A TAA AAT ACA TTT TTT GAC CTA 3792 
Met * Lys His Ser Val Lys Cys Thr * Asn Thr Phe Phe Asp Leu 
1820 1825 1830 

TAT TTT TTT TGT TAA TGC CAA ATT TTA TAC AGT AAA TCA ATA TGA ATG 3 840 
10 Tyr Phe Phe Cys * Cys Gin lie Leu Tyr Ser Lys Ser lie * Met 
1835 1840 1845 

TAA CTA TTT GTA TTT CAA ATG TAA TTT ATT TAT GAA ATG GTC GTA AGA 3888 

* Leu Phe Val Phe Gin Met * Phe lie Tyr Glu Met Val Val Arg 
15 1850 1855 1860 

TTA CCT CGG GTG AAG AAT AAC TTA TTC TGC ACC CTG GGT GAT GAA TAG 3 93 6 

Leu Pro Arg Val Lys Asn Asn Leu Phe Cys Thr Leu Gly Asp Glu * 
1865 1870 1875 1880 

20 

TAA CAC TAT ATA TAT ATA TAT ATA TAT ATA TAT ATA TAT ATA CCG GCT 3 9 84 

* His Tyr lie Tyr lie Tyr lie Tyr lie Tyr lie Tyr lie Pro Ala 

1885 1890 1895 

2 5 GCT GCT AAT GAT GTT AAT ATT TCG CAA GTA CCT AAG CTG GAT TTT TCT 403 2 

Ala Ala Asn Asp Val Asn lie Ser Gin Val Pro Lys Leu Asp Phe Ser 
1900 1905 1910 

CCA TGA GAC ATC AAT CCA TAA TTG AAA TTG GTC ACG ACA GTT GAA TAG 4080 

3 0 Pro * Asp lie Asn Pro * Leu Lys Leu Val Thr Thr Val Glu * 

1915 1920 1925 

TTG ATA GCT GAA AAT GAA ATC CAG CAT GCT ACT GTC TTG CCA TCT CCA 412 8 
Leu lie Ala Glu Asn Glu lie Gin His Ala Thr Val Leu Pro Ser Pro 
35 1930 1935 1940 

GAC TTG CTA ACA TGA ATT TTG TCT GCC TAC CTG TCA TTT GTA CCA ACG 417 6 

Asp Leu Leu Thr * lie Leu Ser Ala Tyr Leu Ser Phe Val Pro Thr 
1945 1950 1955 1960 

40 

TTC CCA ATT GCC CTC TCA TTA TTC GTG TGT ACC ATG CAT ATG TGT TTT 422 4 

Phe Pro lie Ala Leu Ser Leu Phe Val Cys Thr Met His Met Cys Phe 
1965 1970 1975 

45 AAC ATG ATT ATT GTT GGC TAT ATT TCT CTT TGG AAA CAT GAC TAA TTT 4272 
Asn Met lie lie Val Gly Tyr lie Ser Leu Trp Lys His Asp * Phe 
1980 1985 1990 

ATC ACC CGT TTT GTA TAA ACT GCT TGT TTT CAT ATC AGG ATG AAC TTT 43 2 0 
50 lie Thr Arg Phe Val * Thr Ala Cys Phe His lie Arg Met Asn Phe 
1995 2000 2005 

TGG GGA TAT TCT ACC ATA AAC TTC TTT TCA CCA ATG ACG AGA TAC ACA 43 68 
Trp Gly Tyr Ser Thr lie Asn Phe Phe Ser Pro Met Thr Arg Tyr Thr 
55 2010 2015 2020 

TCA GGC GGG ATA AAA AAC TGT GGG CGT GAT GCC ATA AAT GAG TTC AAA 4416 
Ser Gly Gly lie Lys Asn Cys Gly Arg Asp Ala lie Asn Glu Phe Lys 
2025 2030 2035 2040 

ACT TTT GTA AGA GAG GCT CAC AAA CGG GGA ATT GAG GTA AGC AAG TCG 4464
Thr Phe Val Arg Glu Ala His Lys Arg Gly Ile Glu Val Ser Lys Ser
2045 2050 2055

TAC GAG TTA GTT GCT CCT TTT GAA CTT ATC AAT TTG ATG CGA AGA CAT 4512
Tyr Glu Leu Val Ala Pro Phe Glu Leu Ile Asn Leu Met Arg Arg His
2060 2065 2070

GTT ACT GCT AGG TGA TCC TGG ATG TTG TCT TCA ACC ATA CAG CTG AGG 4560
Val Thr Ala Arg * Ser Trp Met Leu Ser Ser Thr Ile Gln Leu Arg
2075 2080 2085

GTA ATG AGA ATG GTC CAA TAT TAT CAT TTA GGG GGG TCG ATA ATA CTA 4608
Val Met Arg Met Val Gln Tyr Tyr His Leu Gly Gly Ser Ile Ile Leu
2090 2095 2100

CAT ACT ATA TGC TTG CAC CCA AGG TGA CAG ATC TTT CTT GCT GCG TAA 4656
His Thr Ile Cys Leu His Pro Arg * Gln Ile Phe Leu Ala Ala *
2105 2110 2115 2120

TTG TTC TTT CAT AGA TGT ATA GAG CAT AGA TGT GTT ATG TAG TAG TTC 4704
Leu Phe Phe His Arg Cys Ile Glu His Arg Cys Val Met * * Phe
2125 2130 2135

TTT TTC AAG GGG ATT ATG TTC ATG CAG GGA GAG TTT TAT AAC TAT TCT 4752
Phe Phe Lys Gly Ile Met Phe Met Gln Gly Glu Phe Tyr Asn Tyr Ser
2140 2145 2150

GGC TGT GGG AAT ACC TTC AAC TGT AAT CAT CCT GTG GTT CGT CAA TTC 4800
Gly Cys Gly Asn Thr Phe Asn Cys Asn His Pro Val Val Arg Gln Phe
2155 2160 2165

ATT GTA GAT TGT TTA AGG TAC AGA TAT ACA TTT TAC TTC TAG AAC TAC 4848
Ile Val Asp Cys Leu Arg Tyr Arg Tyr Thr Phe Tyr Phe * Asn Tyr
2170 2175 2180

TTT TTC ATT TCT TTT GCT GCT TGT CAT TTT GAT ATG ATT AAT TTG CAA 4896
Phe Phe Ile Ser Phe Ala Ala Cys His Phe Asp Met Ile Asn Leu Gln
2185 2190 2195 2200

GCT TGT GGG GGT AAA TCT TTT GGT CAG CAT ATT GTA TCT TTA AAT GTC 4944
Ala Cys Gly Gly Lys Ser Phe Gly Gln His Ile Val Ser Leu Asn Val
2205 2210 2215

ACA AAT ACT AAT GTC CTG GTG CTT ATT GAT TTG GCA TCT TCA AAT TCT 4992
Thr Asn Thr Asn Val Leu Val Leu Ile Asp Leu Ala Ser Ser Asn Ser
2220 2225 2230

TCT CCA ATG AAA AGG GAA AAA TCT ACT GTA TGT CTC GTC AAC TAA TTT 5040
Ser Pro Met Lys Arg Glu Lys Ser Thr Val Cys Leu Val Asn * Phe
2235 2240 2245

ACT TTT GTT TTG CAG ATA CTG GGT GAT GGA AAT GCA TGT TGA TGG TTT 5088
Thr Phe Val Leu Gln Ile Leu Gly Asp Gly Asn Ala Cys * Trp Phe
2250 2255 2260

TCG TTT TGA TCT TGC ATC CAT AAT GAC CAG AGG TTC CAG GTA ATT TGT 5136
Ser Phe * Ser Cys Ile His Asn Asp Gln Arg Phe Gln Val Ile Cys
2265 2270 2275 2280

ATT TAT TGT TTG TTT GCG TGT TGC CTT TTC AGA AGA TTC TTA AAA GAA 5184
Ile Tyr Cys Leu Phe Ala Cys Cys Leu Phe Arg Arg Phe Leu Lys Glu
2285 2290 2295

TGT TTC TTT TAC AAG TCT GTG GGA TCC AGT TAA CGT GTA TGG AGC TCC 5232
Cys Phe Phe Tyr Lys Ser Val Gly Ser Ser * Arg Val Trp Ser Ser
2300 2305 2310

AAT AGA AGG TGA CAT GAT CAC AAC AGG GAC ACC TCT TGT TAC TCC ACC 5280
Asn Arg Arg * His Asp His Asn Arg Asp Thr Ser Cys Tyr Ser Thr
2315 2320 2325

ACT TAT TGA CAT GAT CAG CAA TGA CCC AAT TCT TGG AGG CGT CAA GGT 5328
Thr Tyr * His Asp Gln Gln * Pro Asn Ser Trp Arg Arg Gln Gly
2330 2335 2340

ACT TGT TTC ATC CAA CAC CTG TTG TCT GTG TGC ATT CAA TTG TTT TAA 5376
Thr Cys Phe Ile Gln His Leu Leu Ser Val Cys Ile Gln Leu Phe *
2345 2350 2355 2360

TAT GGT AAT GAT CAA TTT CCC AAT GTT GAT AAG GAA AAA AAA TGC AAG 5424
Tyr Gly Asn Asp Gln Phe Pro Asn Val Asp Lys Glu Lys Lys Cys Lys
2365 2370 2375

TAG CTC TCT TTA TCT GCT TCT TGT GAG TTA TGC TAA ACA TGT AGA TAC 5472
* Leu Ser Leu Ser Ala Ser Cys Glu Leu Cys * Thr Cys Arg Tyr
2380 2385 2390

TAC TAT ATT TCA ACT GTA TAT ACT TGA CAT ATT ATT GCT TCC TTG GGA 5520
Tyr Tyr Ile Ser Thr Val Tyr Thr * His Ile Ile Ala Ser Leu Gly
2395 2400 2405

GGC TCT CTT ATT CCT TTC CCC CGT TGC AAT TAT AGC TCA TTG CTG AAG 5568
Gly Ser Leu Ile Pro Phe Pro Arg Cys Asn Tyr Ser Ser Leu Leu Lys
2410 2415 2420

CAT GGG ATG CAG GAG GCC TCT ATC AAG TAG GTC AAT TCC CTC ACT GGA 5616
His Gly Met Gln Glu Ala Ser Ile Lys * Val Asn Ser Leu Thr Gly
2425 2430 2435 2440

ATG TTT GGT CTG AGT GGA ATG GGA AGG TAA GGT ACC TGT TAA AAG TTT 5664
Met Phe Gly Leu Ser Gly Met Gly Arg * Gly Thr Cys * Lys Phe
2445 2450 2455

GAA TGG CAA ATA CTG ATA GAA ATA TAA CTT ATA TTT GCG ACA TAT ATA 5712
Glu Trp Gln Ile Leu Ile Glu Ile * Leu Ile Phe Ala Thr Tyr Ile
2460 2465 2470

GAT AAA GCA AAA TAA TAC GCA TTC CAC CTG AAC TTT AAA GGG GCA CGC 5760
Asp Lys Ala Lys * Tyr Ala Phe His Leu Asn Phe Lys Gly Ala Arg
2475 2480 2485

AGA ATT ATC CCG CAT CTG TCT ACA AGA ATG ATA ACA CAT GTG CTG AAT 5808
Arg Ile Ile Pro His Leu Ser Thr Arg Met Ile Thr His Val Leu Asn
2490 2495 2500

AGT GAA GTA CTA CTT CTC AAA TGT CTG AAT GAA CGC ACT AAC TCT TGT 5856
Ser Glu Val Leu Leu Leu Lys Cys Leu Asn Glu Arg Thr Asn Ser Cys
2505 2510 2515 2520

GAG TGT CAA CCG AGC AAG AAA TAT TTG AGT TTT CTG CAA GAA ATT GTT 5904
Glu Cys Gln Pro Ser Lys Lys Tyr Leu Ser Phe Leu Gln Glu Ile Val
2525 2530 2535

CAT GTT GTG CTG TAT TAT ACT CCC TCC GTC CGA AAT TAT TTG TCG GAG 5952
His Val Val Leu Tyr Tyr Thr Pro Ser Val Arg Asn Tyr Leu Ser Glu
2540 2545 2550

AAA TGG ATG TAT CTA GAC GTA TTT TAG TTC TAG ATA CAT CCA TTT TTA 6000
Lys Trp Met Tyr Leu Asp Val Phe * Phe * Ile His Pro Phe Leu
2555 2560 2565



WO 99/14314

PCT/AU98/00743

- 111 -

TCC ATT TCT GCA ACA AGT AGT TCC GGA CGG AGG GAG TAT CAT TTA ACA 6048
Ser Ile Ser Ala Thr Ser Ser Ser Gly Arg Arg Glu Tyr His Leu Thr
2570 2575 2580

AAT ATA TGC ATG TTC GAA GTA AAT CCC CAC GAA TAA GCA TAT AAG ACG 6096
Asn Ile Cys Met Phe Glu Val Asn Pro His Glu * Ala Tyr Lys Thr
2585 2590 2595 2600

ATA TTG CTT TTT GAC TTG CAA CAC CTA AAC CTC ATT GTT TTC TCC TAG 6144
Ile Leu Leu Phe Asp Leu Gln His Leu Asn Leu Ile Val Phe Ser *
2605 2610 2615

GAT TTT GGG TGT TCG AAG CAA GCA GCT GGT GAT ATT TAA TTT ACC TTT 6192
Asp Phe Gly Cys Ser Lys Gln Ala Ala Gly Asp Ile * Phe Thr Phe
2620 2625 2630

GCC TTT ATT TGT AGC TTG ATT TGA GGG TGC GGC AAA GGT TTT AGC TTA 6240
Ala Phe Ile Cys Ser Leu Ile * Gly Cys Gly Lys Gly Phe Ser Leu
2635 2640 2645

GTA GTG TTT TGT AAA TTA TTA TAG TTT ATG TAT ATA CTC CTC ATT TGG 6288
Val Val Phe Cys Lys Leu Leu * Phe Met Tyr Ile Leu Leu Ile Trp
2650 2655 2660

GCA CTT CCG TAC TGG TCC CAT AGA AGA TAA AAA TGG AAT GAT GTC TGG 6336
Ala Leu Pro Tyr Trp Ser His Arg Arg * Lys Trp Asn Asp Val Trp
2665 2670 2675 2680

CCA ATA ATT GTT GAC AAC ACT GTT GCG CAT TTG ATT TTT ATC AGG GAA 6384
Pro Ile Ile Val Asp Asn Thr Val Ala His Leu Ile Phe Ile Arg Glu
2685 2690 2695

TGG AAA ATT GAA ATC GGT AAG AAA CAT TGC GAT ATT AAG CTT GTA TAT 6432
Trp Lys Ile Glu Ile Gly Lys Lys His Cys Asp Ile Lys Leu Val Tyr
2700 2705 2710

GCT AAT GCT GGT GGA TCT TTA AGA GGG AAC ATA TGA TCT CGT GTG CAT 6480
Ala Asn Ala Gly Gly Ser Leu Arg Gly Asn Ile * Ser Arg Val His
2715 2720 2725

CCA TCT TCA ACT AAA AAA ATA TGT TGC ACA TCT CCC ACG TCA CTT ACT 6528
Pro Ser Ser Thr Lys Lys Ile Cys Cys Thr Ser Pro Thr Ser Leu Thr
2730 2735 2740

AGC TAT TTC ATC CAA GTA CTA ACT TGT GTG GTT GTC TCC TCA GTA CCG 6576
Ser Tyr Phe Ile Gln Val Leu Thr Cys Val Val Val Ser Ser Val Pro
2745 2750 2755 2760

GGA CAT TGT GCG CCA ATT CAT TAA AGG CAC TGA TGG ATT TGC TGG TGG 6624
Gly His Cys Ala Pro Ile His * Arg His * Trp Ile Cys Trp Trp
2765 2770 2775

TTT TGC CGA ATG TCT TTG TGG AAG TCC ACA CCT ATA CCA GGT AAG TTG 6672
Phe Cys Arg Met Ser Leu Trp Lys Ser Thr Pro Ile Pro Gly Lys Leu
2780 2785 2790

TGG CAA TAC TTG GAA ATG GGT TGA GTG AAT GTC ACA TGG ATT TTT TAT 6720
Trp Gln Tyr Leu Glu Met Gly * Val Asn Val Thr Trp Ile Phe Tyr
2795 2800 2805

ATA TAC CAC ATG ATG ATA CAC ATG TAA ATA TAT AAC GAT TAT AGT GTA 6768
Ile Tyr His Met Met Ile His Met * Ile Tyr Asn Asp Tyr Ser Val
2810 2815 2820

TGC ATA TGC ATT TGG CTA AGA AGT ACT CCC TCC CTT AGT AAA AGT TAG 6816
Cys Ile Cys Ile Trp Leu Arg Ser Thr Pro Ser Leu Ser Lys Ser *
2825 2830 2835 2840

TAC AAA GTT GAG TCA TCT ATT TTG GAA CGG AGG GAG TAT AAG TGT ATA 6864
Tyr Lys Val Glu Ser Ser Ile Leu Glu Arg Arg Glu Tyr Lys Cys Ile
2845 2850 2855

CAC TAG TGC AAT ATA TAG GTT TTA ACA CCC AAC TTG CCA ATG AAG GAA 6912
His * Cys Asn Ile * Val Leu Thr Pro Asn Leu Pro Met Lys Glu
2860 2865 2870

CAT AGG GCT TTC TAG TTA TCT TAT TTA TTT GTC TGG TGA ATA ATC CAC 6960
His Arg Ala Phe * Leu Ser Tyr Leu Phe Val Trp * Ile Ile His
2875 2880 2885

TGA AAA ATT CCA GCC ATG TCA TTT TTT AGG GGG GGA GAA GAA ACT ACA 7008
* Lys Ile Pro Ala Met Ser Phe Phe Arg Gly Gly Glu Glu Thr Thr
2890 2895 2900

TTG ATT TTT CCC CCT AAA AAA AGC CAT CTC AGA TTT CAT AGG TAA CTT 7056
Leu Ile Phe Pro Pro Lys Lys Ser His Leu Arg Phe His Arg * Leu
2905 2910 2915 2920

GCT TTT CTG TAA AGA AAT GAA AAC GAC TTC ATA CTT TCT GTC GAT TAT 7104
Ala Phe Leu * Arg Asn Glu Asn Asp Phe Ile Leu Ser Val Asp Tyr
2925 2930 2935

AAG TGT ATA CAC TAG TGC AAT ATA TAG GTT TTA ACA CCC AAC TTG CCA 7152
Lys Cys Ile His * Cys Asn Ile * Val Leu Thr Pro Asn Leu Pro
2940 2945 2950

ATG AAG GAA CAT AGG GCT TTC TAG TTA TCT TAT TTA TTT GCT GGT GAA 7200
Met Lys Glu His Arg Ala Phe * Leu Ser Tyr Leu Phe Ala Gly Glu
2955 2960 2965

TAA TCC ACT GAA AAA TTC CAG CCA TGT CAT TTT TTA GGG GGG AGA AGA 7248
* Ser Thr Glu Lys Phe Gln Pro Cys His Phe Leu Gly Gly Arg Arg
2970 2975 2980

AAC TAT ATT GAT TTT TCC CCC TAA AAA AAG CCA TCT CAG ATT CAT AGG 7296
Asn Tyr Ile Asp Phe Ser Pro * Lys Lys Pro Ser Gln Ile His Arg
2985 2990 2995 3000

AAC TTG CTT TTC TGT AAA GAA ATG AAA ACG ACT TCA TAC TTT CTG CGG 7344
Asn Leu Leu Phe Cys Lys Glu Met Lys Thr Thr Ser Tyr Phe Leu Arg
3005 3010 3015

CGC TTA CTT AGC TCG ATG GAT ATT TGT AAG ATG AAT GCC AAA TTA TTT 7392
Arg Leu Leu Ser Ser Met Asp Ile Cys Lys Met Asn Ala Lys Leu Phe
3020 3025 3030

GGC GGG ATT TGA TCG TTA TTC CAA ATT TCA TTT GGT TTC TCT AGC AAT 7440
Gly Gly Ile * Ser Leu Phe Gln Ile Ser Phe Gly Phe Ser Ser Asn
3035 3040 3045

CAA CCC AGT ACC TTG TTA TTG GCA CTG CAA TTT CTT ATT GAT TAA TCA 7488
Gln Pro Ser Thr Leu Leu Leu Ala Leu Gln Phe Leu Ile Asp * Ser
3050 3055 3060

GGC AGG AGG AAG GAA ACC TTG GCA CAG TAT CAA CTT GGT ATG TGC ACA 7536
Gly Arg Arg Lys Glu Thr Leu Ala Gln Tyr Gln Leu Gly Met Cys Thr
3065 3070 3075 3080

TGA TGG ATT TAC ACT GGG TGA TTT GGT ACA TAT AAT ACC AAG TCA ATT 7584
* Trp Ile Tyr Thr Gly * Phe Gly Thr Tyr Asn Thr Lys Ser Ile
3085 3090 3095

TAC CAA ATG GGG AGA CCA ATA GAG ATG GAG AAA ATC ACA ATC TTA GCT 7632
Tyr Gln Met Gly Arg Pro Ile Glu Met Glu Lys Ile Thr Ile Leu Ala
3100 3105 3110

GGA ATT GTG GGG AGG TAA TTC TGA ACT CTC CTT TTT TTT TGA AAT TTT 7680
Gly Ile Val Gly Arg * Phe * Thr Leu Leu Phe Phe * Asn Phe
3115 3120 3125

CAT GCT TTA CAT AAT AGT CAA ATG GCT GAC AAA TGT CGT TGT ATG GTT 7728
His Ala Leu His Asn Ser Gln Met Ala Asp Lys Cys Arg Cys Met Val
3130 3135 3140

CTC TCT ACC TAA ACC GTT AAG GCA GTA AGA GTT TCC CTA CAA GAT CTC 7776
Leu Ser Thr * Thr Val Lys Ala Val Arg Val Ser Leu Gln Asp Leu
3145 3150 3155 3160

TTT GTT CGT ATA ATT GTA TTT TCT AGA GAA AAG TTG CCT TCA ATT TTG 7824
Phe Val Arg Ile Ile Val Phe Ser Arg Glu Lys Leu Pro Ser Ile Leu
3165 3170 3175

TGC ACG CGG CAG TAC AGG AAT TGT GGT TAT AAA TAT TGA TAC AGG CTG 7872
Cys Thr Arg Gln Tyr Arg Asn Cys Gly Tyr Lys Tyr * Tyr Arg Leu
3180 3185 3190

ACC ATC GTT ACT AAT AGG GGG AAC AAT AAG CAC ATT TTT TTA ATA GCA 7920
Thr Ile Val Thr Asn Arg Gly Asn Asn Lys His Ile Phe Leu Ile Ala
3195 3200 3205

AAG GCA TCA CCC TTG TTC CGT TTC CAA TGA AAT CAC AGT ATC CGA ACC 7968
Lys Ala Ser Pro Leu Phe Arg Phe Gln * Asn His Ser Ile Arg Thr
3210 3215 3220

ATA AGT TTT ACA AGT ATG CGT AGA GAG AAA TAA AGT ATC AAC CCG GCA 8016
Ile Ser Phe Thr Ser Met Arg Arg Glu Lys * Ser Ile Asn Pro Ala
3225 3230 3235 3240

GAA ACA GTT GTT TCA GGC GCA AAG AGA AAA GGA AAC GAT ATG CTC TAT 8064
Glu Thr Val Val Ser Gly Ala Lys Arg Lys Gly Asn Asp Met Leu Tyr
3245 3250 3255

TAC ATC AAC CTT TTA GCA TTT AGG GAC GAC CAG CAT CAT CCC ATC TTC 8112
Tyr Ile Asn Leu Leu Ala Phe Arg Asp Asp Gln His His Pro Ile Phe
3260 3265 3270

AAT CAA CTG GAG CGA GGT CAC CTC CAA TCT TCT CAG CAG CCT CAG AGT 8160
Asn Gln Leu Glu Arg Gly His Leu Gln Ser Ser Gln Gln Pro Gln Ser
3275 3280 3285

GGT GAC CTC CCA AGC AAG TGC ATC AGC ATC CAT CAT CTG GGG GTT GGG 8208
Gly Asp Leu Pro Ser Lys Cys Ile Ser Ile His His Leu Gly Val Gly
3290 3295 3300

CAC ATA CCA TGA GCA CAA TCA CCT GAA TTT GAT GAA TTT TCC TCT GTT 8256
His Ile Pro * Ala Gln Ser Pro Glu Phe Asp Glu Phe Ser Ser Val
3305 3310 3315 3320

TAC CTT GCA GCA GAC CCC TGC CGT ATA AAT GGT TTT AAA TGA CAG CAT 8304
Tyr Leu Ala Ala Asp Pro Cys Arg Ile Asn Gly Phe Lys * Gln His
3325 3330 3335




GTT CTT TCA GTT TGA GCA AAA TTT GTG CAA TTG CAA AGA AGC TTT AGA 8352
Val Leu Ser Val * Ala Lys Phe Val Gln Leu Gln Arg Ser Phe Arg
3340 3345 3350

ATC ATG TGG AAC ATG CAC TTA CAT TTC ATC TGA CAA TAT AGG AAG GAG 8400
Ile Met Trp Asn Met His Leu His Phe Ile * Gln Tyr Arg Lys Glu
3355 3360 3365

AGC CCG ACG TCG CAT GCT CCT CTA GAC TCG AGG AAT TCG CAA GAT TGT 8448
Ser Pro Thr Ser His Ala Pro Leu Asp Ser Arg Asn Ser Gln Asp Cys
3370 3375 3380

CTG TCA AAA GAT TGA GGA AGA GGC AGA TGC GCA ATT TCT TTG TTT GTC 8496
Leu Ser Lys Asp * Gly Arg Gly Arg Cys Ala Ile Ser Leu Phe Val
3385 3390 3395 3400

TCA TGG TTT CTC AAG TAA GAC TTA TAT CTG ATC TCT TCA ATT TTT GAG 8544
Ser Trp Phe Leu Lys * Asp Leu Tyr Leu Ile Ser Ser Ile Phe Glu
3405 3410 3415

ATT GCC TGT TTT TCA CAA TGG CAT ATG TTG TCA GGT GAA ACA TCC AAT 8592
Ile Ala Cys Phe Ser Gln Trp His Met Leu Ser Gly Glu Thr Ser Asn
3420 3425 3430

CCC AGT ATT AAT AGA GCC AAC ATG AAG GGA TTG CTT ATC TGA GAT ATC 8640
Pro Ser Ile Asn Arg Ala Asn Met Lys Gly Leu Leu Ile * Asp Ile
3435 3440 3445

TGC CAA AGT TGA ATT CTT AGA TTC ACC TTC TTC AGT ATT TCA GAC CTT 8688
Cys Gln Ser * Ile Leu Arg Phe Thr Phe Phe Ser Ile Ser Asp Leu
3450 3455 3460

CTA AGC ATT TTC ATT TTT TTT TTC AAT TGT TAG GGA GTT CCA ATG TTT 8736
Leu Ser Ile Phe Ile Phe Phe Phe Asn Cys * Gly Val Pro Met Phe
3465 3470 3475 3480

TAC ATG GGC GAT GAA TAT GGC CAC ACA AAA GGG GGC AAC AAC AAT ACA 8784
Tyr Met Gly Asp Glu Tyr Gly His Thr Lys Gly Gly Asn Asn Asn Thr
3485 3490 3495

TAC TGC CAT GAT TCT TAT GTC AGT ACA ATT TGG TCA CAT ATT GTT GTT 8832
Tyr Cys His Asp Ser Tyr Val Ser Thr Ile Trp Ser His Ile Val Val
3500 3505 3510

CTA AGT AAC TAT CTT CAA ATC TTT GCA TTC ATC CGT CAT GGC TCT TCT 8880
Leu Ser Asn Tyr Leu Gln Ile Phe Ala Phe Ile Arg His Gly Ser Ser
3515 3520 3525

GTA GGT CAA TTA TTT TCG CTG GGA TAA AAA AGA ACA ATA CTC TGA CTT 8928
Val Gly Gln Leu Phe Ser Leu Gly * Lys Arg Thr Ile Leu * Leu
3530 3535 3540

GCA AAG ATT CTG CTG CCT CAT GAC CAA ATT CCG CAA GTA AGT ATT CCG 8976
Ala Lys Ile Leu Leu Pro His Asp Gln Ile Pro Gln Val Ser Ile Pro
3545 3550 3555 3560

TTG AAT AAT TTC TGT GTA GAA CCA CTG AAG GTG CCT CCA AAC GCT AAG 9024
Leu Asn Asn Phe Cys Val Glu Pro Leu Lys Val Pro Pro Asn Ala Lys
3565 3570 3575

CGA GCA AGG TCA ATT TCA CAC CCT AAT CAA GTT GGT GTT GTC TAT TTG 9072
Arg Ala Arg Ser Ile Ser His Pro Asn Gln Val Gly Val Val Tyr Leu
3580 3585 3590

TGT ATT TGA TCT GCT GCA CTG TAG GGA GTG CGA GGG TCT TGG CCT TGA 9120
Cys Ile * Ser Ala Ala Leu * Gly Val Arg Gly Ser Trp Pro *
3595 3600 3605

GGA CTT TCC AAC GGC CGA ACG GCT GCA GTG GCA TGG TCA TCA GCC TGG 9168
Gly Leu Ser Asn Gly Arg Thr Ala Ala Val Ala Trp Ser Ser Ala Trp
3610 3615 3620

GAA GCC TGA TTG GTC TGA GAA TAG CCG ATT CGT TGC CTT TTC CAT GGT 9216
Glu Ala * Leu Val * Glu * Pro Ile Arg Cys Leu Phe His Gly
3625 3630 3635 3640

ACA CAT ATA GTT CTG ACA CTT CAC TAT AGT TGT TTT AAA AAA GAA AAT 9264
Thr His Ile Val Leu Thr Leu His Tyr Ser Cys Phe Lys Lys Glu Asn
3645 3650 3655

TTA ACT CAA AAG TAA ATT ATG GAG A 9289
Leu Thr Gln Lys * Ile Met Glu
3660

CLAIMS

1. A nucleic acid sequence encoding an enzyme of the
starch biosynthetic pathway in a cereal plant, wherein the
enzyme is selected from the group consisting of starch
branching enzyme I, starch branching enzyme II, starch
soluble synthase I, and debranching enzyme, with the proviso
that the enzyme is not soluble starch synthase I of rice, or
starch branching enzyme I of rice or maize.

2. A sequence according to claim 1, wherein the
sequence is a genomic DNA or cDNA sequence.

3. A sequence according to claim 1 or claim 2,
wherein the sequence is functional in wheat.

4. A sequence according to any one of claims 1 to 3,
wherein the sequence is derived from a Triticum species.

5. A sequence according to claim 4, wherein the
Triticum species is Triticum tauschii.

6. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes starch branching enzyme I or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 5 or SEQ ID NO: 9.

7. A sequence according to claim 6, wherein the
homology is at least 90%.

8. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes starch branching enzyme II or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 10.

9. A sequence according to claim 8, wherein the
homology is at least 90%.

10. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes soluble starch synthase or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 11 or SEQ ID NO: 13.

11. A sequence according to claim 10, wherein the
homology is at least 90%.

12. A sequence according to claim 11, wherein the
sequence encodes a 75 kD soluble starch synthase of wheat.

13. A sequence according to claim 12, which encodes an
amino acid sequence at least 70% homologous to that shown in
SEQ ID NO: 14.

14. A sequence according to any one of claims 1 to 5,
wherein the sequence encodes debranching enzyme or a
biologically-active fragment thereof, and wherein the
sequence has at least 70% sequence homology with the
sequence shown in SEQ ID NO: 17.

15. A sequence according to claim 14, wherein the
homology is at least 90%.

16. A promoter of an enzyme selected from the group 
consisting of starch branching enzyme I, starch branching 
enzyme II, starch soluble synthase I, and debranching 
enzyme, with the proviso that the enzyme is not soluble 
starch synthase I of rice, or starch branching enzyme I of 
rice or maize. 

17. A promoter according to claim 16, wherein the 
promoter is a starch branching enzyme I promoter or 
biologically-active fragment thereof, and wherein the
promoter sequence has at least 70% sequence homology with
the sequence shown in SEQ ID NO: 8.

18. A sequence according to claim 17, wherein the
homology is at least 90%.

19. A promoter according to claim 16, wherein the
promoter is a starch soluble synthase I promoter or
biologically-active fragment thereof, and wherein the
promoter sequence has at least 70% sequence homology with
the sequence shown in SEQ ID NO: 15.

20. A sequence according to claim 19, wherein the
homology is at least 90%.

21. A nucleic acid construct comprising a nucleic acid
sequence encoding an enzyme of the starch biosynthetic
pathway in a cereal plant, operably linked to one or more
nucleic acid sequences facilitating expression of the
nucleic acid sequence in a plant, wherein the enzyme is
selected from the group consisting of starch branching
enzyme I, starch branching enzyme II, starch soluble
synthase I, and debranching enzyme, with the proviso that
the enzyme is not soluble starch synthase I of rice, or
starch branching enzyme I of rice or maize, a
biologically-active fragment thereof.

22. A nucleic acid construct for targeting a gene to
the endosperm of a cereal plant, comprising one or more
promoter sequences selected from the group consisting of
SBE I promoter, SBE II promoter, SSS I promoter, and
DBE promoter, operatively linked to a nucleic acid sequence
encoding a protein, wherein the expression of the targeted
gene in the endosperm of a cereal plant is modified.

23. A construct according to either claim 21 or claim
22, wherein the promoter or nucleic acid sequence is also
operatively linked to one or more additional targeting
sequences and/or one or more 3' untranslated sequences.

24. A construct according to claim 23, wherein the
nucleic acid encoding the protein is either in the sense or
antisense orientation.

25. A construct according to claim 24, wherein the
protein is an enzyme of the starch biosynthetic pathway.

26. A construct according to claim 25, wherein the
nucleic acid encoding the protein is in the antisense
orientation, and the enzyme is selected from the group
consisting of GBSS, starch debranching enzyme, SBE II, low
molecular weight glutenin, and grain softness protein I.

27. A construct according to claim 25, wherein the
nucleic acid encoding the protein is in the sense
orientation, and the enzyme is selected from the group
consisting of bacterial isoamylase, bacterial glycogen
synthase, and wheat high molecular weight glutenin Bx17.

28. A construct according to any one of claims 21 to
27, wherein the plant is a cereal plant.

29. A construct according to claim 28, wherein the
cereal plant is either wheat or barley.

30. A construct according to claim 29, wherein the
cereal plant is wheat.

31. A construct according to any one of claims 21 to
30, wherein the construct is either a plasmid or a vector.

32. A construct according to claim 31, wherein the
plasmid or vector is suitable for use in the transformation
of a plant.

33. A construct according to claim 32, wherein the
plasmid is selected from the group consisting of those
depicted in Figures 22a to 22f.

34. A construct according to claim 32, wherein the
vector is a bacterium of the genus Agrobacterium.

35. A construct according to claim 34, wherein the
vector is Agrobacterium tumefaciens.

36. A method of modifying the characteristics of
starch produced by a plant, comprising the steps of:

(a) introducing a nucleic acid sequence encoding
an enzyme of the starch biosynthetic pathway into a host
plant, and/or

(b) introducing an anti-sense nucleic acid
sequence directed to a gene encoding an enzyme of the starch
biosynthetic pathway into a host plant,

wherein the enzyme is selected from the group
consisting of starch branching enzyme I, starch branching
enzyme II, starch soluble synthase I, and debranching
enzyme, with the proviso that the enzyme is not soluble
starch synthase I of rice, or starch branching enzyme I of
rice or maize, and wherein if both steps (a) and (b) are
used, the enzymes in the two steps are different.

37. A method according to claim 36, wherein the plant
is a cereal plant.

38. A method according to claim 37, wherein the cereal
plant is wheat or barley.

39. A method of targeting expression of a gene to the
endosperm of a cereal plant, comprising the step of
transforming the plant with a construct according to any one
of claims 21 to 35.

40. A method of modulating the time of expression of a
gene in endosperm of a cereal plant, comprising the step of
transforming the plant with a construct according to any one
of claims 21 to 35.

41. A method according to claim 40, wherein when
expression at an early stage following anthesis is desired,
the construct comprises either the SBE II, SSS I, or DBE
promoter.

42. A method according to claim 40, wherein when
expression at a later stage following anthesis is desired,
the construct comprises the SBE I promoter.

43. A plant transformed with a construct according to
any one of claims 21 to 35.

44. A plant according to claim 43, wherein the plant
is a cereal plant.

45. A plant according to claim 44, wherein the cereal
plant is wheat or barley.

46. A method of identifying variations in the starch
synthesis characteristics of a cereal plant, comprising the
step of identifying a variation in nucleic acid sequence in
the intron regions of the SBE I, SBE II, SSS I or DBE genes.

47. A method of identifying variations in the starch
synthesis characteristics of a cereal plant, comprising the
step of identifying a variation in nucleic acid sequence
compared to the sequence shown in one or more SEQ ID NO: 5,
SEQ ID NO: 7, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID
NO: 13, SEQ ID NO: 15, SEQ ID NO: 16, or SEQ ID NO: 17.

48. A method according to claim 47, in which a
mutation or absence of a SBE I, SBE II, SSS I or DBE gene is
detected.

49. A method according to either claim 47 or claim 48,
in which the cereal plant is wheat or barley.

50. A product comprising plant material propagated
from a plant transformed with a nucleic acid sequence
encoding an enzyme of the starch biosynthetic pathway in a
cereal plant, operably linked to one or more nucleic acid
sequences facilitating expression of the nucleic acid
sequence in a plant, wherein the enzyme is selected from the
group consisting of starch branching enzyme I, starch
branching enzyme II, starch soluble synthase I, and
debranching enzyme, with the proviso that the enzyme is not
soluble starch synthase I of rice, or starch branching
enzyme I of rice or maize, a biologically-active fragment
thereof.

51. A product comprising plant material propagated
from a plant in which a gene was targeted to the endosperm
of a cereal plant, by a nucleic acid construct comprising
one or more promoter sequences selected from the group
consisting of SBE I promoter, SBE II promoter, SSS I
promoter, and DBE promoter, operatively linked to a nucleic
acid sequence encoding a protein, wherein the expression of
the targeted gene in the endosperm of a cereal plant is
modified.

52. A product according to claim 50 or claim 51
wherein the product is a food product.

SBE II Intron 10 primer set - digested with DdeI